<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Benchmark - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Benchmark - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Tue, 23 Jun 2026 21:15:34 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/benchmark/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Choose the Right LLM for Your Projects: A Guide to Effective Model Benchmarking ]]>
                </title>
                <description>
                    <![CDATA[ When you start building with LLMs, it quickly becomes clear that not all models behave the same. One model may excel at creative writing but struggle with technical precision. Another might be thoughtful yet verbose. A third could be fast and efficie... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/choose-the-right-llm-for-your-projects-benchmarking-guide/</link>
                <guid isPermaLink="false">690e27b0ca4947acbefbd755</guid>
                
                    <category>
                        <![CDATA[ llm ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Benchmark ]]>
                    </category>
                
                    <category>
                        <![CDATA[ evaluation metrics ]]>
                    </category>
                
                    <category>
                        <![CDATA[ benchmarking ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Surya Teja Appini ]]>
                </dc:creator>
                <pubDate>Fri, 07 Nov 2025 17:09:04 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762534383880/404f27c6-2995-4daa-bcac-c61b10e93abc.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>When you start building with LLMs, it quickly becomes clear that not all models behave the same. One model may excel at creative writing but struggle with technical precision. Another might be thoughtful yet verbose. A third could be fast and efficient yet less consistent. So how do you choose the right one for your task?</p>
<p>This guide walks you through a comprehensive workflow for evaluating and selecting the best LLM for your needs. It’s designed for developers who want more than API demos. You’ll see how to design, test, and compare models using real examples and meaningful metrics.</p>
<p>By the end, you’ll understand not only <em>how</em> to benchmark models but <em>why</em> each step matters.</p>
<h2 id="heading-table-of-contents"><strong>Table of Contents</strong></h2>
<ul>
<li><p><a class="post-section-overview" href="#heading-why-public-benchmarks-arent-enough">Why Public Benchmarks Aren’t Enough</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-1-define-the-task-and-metrics">Step 1: Define the Task and Metrics</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-2-prepare-the-data-and-generate-outputs">Step 2: Prepare the Data and Generate Outputs</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-3-automate-evaluation-with-a-judge-llm">Step 3: Automate Evaluation with a Judge LLM</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-4-analyze-visualize-and-interpret">Step 4: Analyze, Visualize, and Interpret</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-step-5-iterate-and-scale-up">Step 5: Iterate and Scale Up</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-preparing-a-test-dataset">Preparing a Test Dataset</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-cloud-providers-and-apis-for-llm-access">Cloud Providers and APIs for LLM Access</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-common-pitfalls-to-avoid">Common Pitfalls to Avoid</a></p>
</li>
<li><p><a class="post-section-overview" href="#heading-conclusion-turning-evaluation-into-insight">Conclusion: Turning Evaluation into Insight</a></p>
</li>
</ul>
<h2 id="heading-why-public-benchmarks-arent-enough">Why Public Benchmarks Aren’t Enough</h2>
<p>Public leaderboards like MMLU, HumanEval, and HellaSwag show how well models perform on general tests, but they don’t reflect the nuances of your real-world application. A model that scores 90% on reasoning might still fail to produce factual or brand-aligned answers for your domain.</p>
<p>For example, if you’re building a customer review summarizer, your goal isn’t just correctness. It’s also tone, style, and reliability. You might value concise responses with minimal hallucination over creative but inconsistent writing.</p>
<p>That’s why you need a <strong>custom benchmark</strong> that mirrors your actual inputs and quality expectations.</p>
<h2 id="heading-step-1-define-the-task-and-metrics">Step 1: Define the Task and Metrics</h2>
<p>To start, you’ll want to decide up front what success looks like for your application by translating a product need (for example, a short, factual review summarizer) into measurable criteria such as accuracy, factuality, conciseness, latency, and cost. Clear, specific goals make the rest of the pipeline meaningful and comparable.</p>
<p><strong>Example task:</strong> Summarize user reviews into concise, factual one-liners.</p>
<h3 id="heading-key-metrics">Key Metrics</h3>
<ul>
<li><p><strong>Accuracy:</strong> Does the summary reflect the correct information?</p>
</li>
<li><p><strong>Factuality:</strong> Does it avoid hallucinations?</p>
</li>
<li><p><strong>Conciseness:</strong> Is it short yet meaningful?</p>
</li>
<li><p><strong>Latency:</strong> How long does it take per query?</p>
</li>
<li><p><strong>Cost:</strong> How much do API tokens cost per 1,000 requests?</p>
</li>
</ul>
<p>These metrics will help balance technical trade-offs and real-world constraints.</p>
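<p>To make the cost metric concrete, here is a minimal sketch that estimates the price of 1,000 requests from average token counts. The function name <code>cost_per_1k_requests</code> and every number in it are hypothetical placeholders, so substitute your provider’s actual per-token prices and your own measured token counts.</p>

```python
def cost_per_1k_requests(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the cost in dollars of 1,000 requests.

    Token counts are averages per request; prices are dollars per 1M tokens.
    """
    per_request = (
        input_tokens * input_price_per_m + output_tokens * output_price_per_m
    ) / 1_000_000
    return per_request * 1_000


# Hypothetical numbers: 400 input tokens and 40 output tokens per request,
# priced at $0.50 per 1M input tokens and $1.50 per 1M output tokens.
estimate = cost_per_1k_requests(400, 40, 0.50, 1.50)
print(f"Estimated cost per 1,000 requests: ${estimate:.2f}")
```

<p>Running the same calculation for every candidate model gives you a cost figure you can set next to the quality and latency numbers.</p>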
<h2 id="heading-step-2-prepare-the-data-and-generate-outputs">Step 2: Prepare the Data and Generate Outputs</h2>
<p>Now, we’ll build a small but representative test set and generate candidate outputs from each model you plan to evaluate. The purpose is to create comparable inputs and collect the raw outputs you will later score and analyze.</p>
<p><strong>Requirements:</strong> Python 3.9+ and <code>pandas</code> installed (<code>pip install pandas</code>).</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd

reviews = [
    <span class="hljs-string">"The camera quality is great but the battery dies fast."</span>,
    <span class="hljs-string">"Love the design and performance, but it's overpriced."</span>,
    <span class="hljs-string">"Fast processor, poor sound quality, average screen."</span>
]
references = [
    <span class="hljs-string">"Good camera, poor battery."</span>,
    <span class="hljs-string">"Excellent design but expensive."</span>,
    <span class="hljs-string">"Fast but weak audio and display."</span>
]

<span class="hljs-comment"># Build a tiny DataFrame for quick iteration</span>
df = pd.DataFrame({<span class="hljs-string">"review"</span>: reviews, <span class="hljs-string">"reference"</span>: references})
print(<span class="hljs-string">"Sample data:"</span>)
print(df.head())  <span class="hljs-comment"># sanity check: confirm shape/columns</span>
</code></pre>
<p>Now, generate responses using multiple LLMs through <strong>OpenRouter</strong>, which unifies different APIs into one.</p>
<p><strong>Requirements:</strong> An OpenRouter API key (shown below as the placeholder <code>YOUR_KEY</code>), the <code>openai</code> Python SDK installed (<code>pip install openai</code>), and access to the models you plan to test. OpenRouter exposes an OpenAI-compatible endpoint, so the OpenAI client works once you point <code>base_url</code> at it.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> time

<span class="hljs-comment"># Initialize an OpenAI-compatible client pointed at OpenRouter</span>
client = OpenAI(base_url=<span class="hljs-string">"https://openrouter.ai/api/v1"</span>, api_key=<span class="hljs-string">"YOUR_KEY"</span>)

<span class="hljs-comment"># Replace these placeholders with whichever providers/models you can access</span>
models = [<span class="hljs-string">"model-A"</span>, <span class="hljs-string">"model-B"</span>, <span class="hljs-string">"model-C"</span>]

results = {}
<span class="hljs-keyword">for</span> model <span class="hljs-keyword">in</span> models:
    print(<span class="hljs-string">f"Evaluating <span class="hljs-subst">{model}</span>..."</span>)
    start = time.time()
    outputs = []
    <span class="hljs-keyword">for</span> review <span class="hljs-keyword">in</span> reviews:
        <span class="hljs-comment"># Keep the prompt identical across models to reduce bias</span>
        res = client.chat.completions.create(
            model=model,
            messages=[{<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: <span class="hljs-string">f"Summarize this review: <span class="hljs-subst">{review}</span>"</span>}]
        )
        outputs.append(res.choices[<span class="hljs-number">0</span>].message.content.strip())
    <span class="hljs-comment"># Store both the raw outputs and a coarse latency figure</span>
    results[model] = {<span class="hljs-string">"outputs"</span>: outputs, <span class="hljs-string">"latency"</span>: time.time() - start}

print(<span class="hljs-string">"Model outputs generated."</span>)
</code></pre>
<p><strong>Tip:</strong> Even a handful of examples per model can reveal consistent behavior patterns.</p>
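<p>The loop above records one coarse latency figure per model. If you also want to see how latency varies from request to request, a sketch like the following times each call separately. The helpers <code>time_calls</code> and <code>summarize_latency</code> are hypothetical names, and <code>fake_model</code> is a stand-in that lets the example run without an API key.</p>

```python
import statistics
import time


def time_calls(fn, inputs):
    """Call fn on each input and return (outputs, per-call latencies in seconds)."""
    outputs, latencies = [], []
    for item in inputs:
        start = time.perf_counter()
        outputs.append(fn(item))
        latencies.append(time.perf_counter() - start)
    return outputs, latencies


def summarize_latency(latencies):
    """Return the mean, median, and worst-case latency for a list of timings."""
    return {
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }


def fake_model(review):
    """Stand-in for a real model call so the sketch runs offline."""
    time.sleep(0.01)
    return review[:20]


outputs, latencies = time_calls(fake_model, ["first review", "second review", "third review"])
print(summarize_latency(latencies))
```

<p>In a real run you would pass a function that wraps your API call in place of <code>fake_model</code>.</p>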
<h2 id="heading-step-3-automate-evaluation-with-a-judge-llm">Step 3: Automate Evaluation with a Judge LLM</h2>
<p>In this step, our goal is to replace slow, inconsistent manual labeling with a repeatable, programmatic judging step. We’ll use a fixed judge model and a short rubric so you can get machine-readable scores that reflect qualitative criteria like tone, clarity, and factuality.</p>
<p>Before we go on, let’s clear something up: what is a model-as-a-judge? A model-as-a-judge (MAAJ), more commonly called LLM-as-a-judge, uses one LLM to grade outputs from another LLM against task-specific criteria. By prompting the judge with a clear, consistent rubric, you get structured scores that are repeatable and machine-readable. This is useful for aggregation, tracking, and visualization.</p>
<p>We use a fixed rubric because it minimizes drift between runs, and JSON because it makes the output easy to parse and analyze programmatically.</p>
<p>Here are some tips for reliable judging:</p>
<ul>
<li><p>Use a judge model that follows instructions well and keep it fixed across evaluation runs.</p>
</li>
<li><p>Calibrate the rubric: start with 2–4 criteria and a simple numeric scale (for example, 1–5).</p>
</li>
<li><p>Avoid self-judging: prefer a judge from a different provider or model family where possible to reduce shared biases.</p>
</li>
<li><p>For tie-breakers or fine-grained comparisons, consider pairwise judgments (ask the judge to pick the better of two candidates) and convert preferences into scores.</p>
</li>
</ul>
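<p>The pairwise tip above needs a way to turn preferences into scores. One simple option, sketched here under assumptions, is a win rate per model. The <code>win_rates</code> function and the tuple format <code>(model_a, model_b, winner)</code> are hypothetical, and the judgments are made-up sample data rather than real results.</p>

```python
from collections import defaultdict


def win_rates(judgments):
    """Convert pairwise judgments into a win rate per model.

    Each judgment is (model_a, model_b, winner), where winner is one of the
    two model names or "tie". A tie counts as half a win for each side.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in judgments:
        games[model_a] += 1
        games[model_b] += 1
        if winner == "tie":
            wins[model_a] += 0.5
            wins[model_b] += 0.5
        else:
            wins[winner] += 1.0
    return {model: wins[model] / games[model] for model in games}


# Made-up sample judgments for illustration only
judgments = [
    ("model-A", "model-B", "model-A"),
    ("model-A", "model-C", "tie"),
    ("model-B", "model-C", "model-C"),
]
rates = win_rates(judgments)
for model, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {rate:.2f}")
```

<p>Judges can favor whichever candidate appears first, so it may help to ask for each comparison twice with the order swapped and count both answers.</p>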
<p><strong>Requirements:</strong> An API key for your judge provider and the official SDK (for example <code>pip install openai</code>).</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> openai <span class="hljs-keyword">import</span> OpenAI
<span class="hljs-keyword">import</span> json

client = OpenAI(api_key=<span class="hljs-string">"YOUR_API_KEY"</span>)  <span class="hljs-comment"># or load from env var</span>

<span class="hljs-comment"># Clear rubric keeps the judge consistent across runs</span>
PROMPT = <span class="hljs-string">"""
You are grading summaries on a scale of 1-5 for:
1. Correctness (alignment with the reference)
2. Conciseness (brevity and clarity)
3. Helpfulness (coverage of key points)
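Return only a JSON object with exactly these keys: "correctness", "conciseness", "helpfulness". Each value must be an integer from 1 to 5.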
"""</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">evaluate</span>(<span class="hljs-params">candidate, reference</span>):</span>
    <span class="hljs-comment"># Provide both reference and candidate to the judge</span>
    msg = <span class="hljs-string">f"Reference: <span class="hljs-subst">{reference}</span>\nCandidate: <span class="hljs-subst">{candidate}</span>"</span>
    response = client.chat.completions.create(
        model=<span class="hljs-string">"judge-model"</span>,  <span class="hljs-comment"># keep the judge fixed for fair comparisons</span>
        messages=[
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"system"</span>, <span class="hljs-string">"content"</span>: PROMPT},
            {<span class="hljs-string">"role"</span>: <span class="hljs-string">"user"</span>, <span class="hljs-string">"content"</span>: msg}
        ]
    )
    <span class="hljs-comment"># Judge returns JSON; parse into a Python dict</span>
    <span class="hljs-keyword">return</span> json.loads(response.choices[<span class="hljs-number">0</span>].message.content)
</code></pre>
<p>In the code above,</p>
<ul>
<li><p>The rubric in <code>PROMPT</code> defines scoring dimensions (for example: correctness, conciseness, helpfulness). The judge is instructed to return a JSON object.</p>
</li>
<li><p>For each candidate and its reference, the judge receives both strings and applies the rubric.</p>
</li>
<li><p>The judge’s JSON output is parsed with <code>json.loads(...)</code> and aggregated per model to compute averages or distributions.</p>
</li>
</ul>
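<p>Calling <code>json.loads(...)</code> directly on the reply works only when the judge returns clean JSON. Judges sometimes wrap the object in extra text, so a more defensive parser can be useful. The sketch below assumes the score keys are named <code>correctness</code>, <code>conciseness</code>, and <code>helpfulness</code>. The function <code>parse_judge_scores</code> is a hypothetical helper, not part of any SDK.</p>

```python
import json

# Assumed key names; adjust them to whatever your rubric asks the judge to return
EXPECTED_KEYS = ("correctness", "conciseness", "helpfulness")


def parse_judge_scores(raw, keys=EXPECTED_KEYS):
    """Extract and validate a scores dict from the judge's raw reply.

    Returns None when the reply has no parseable JSON object, is missing
    a key, or has a score outside the 1-5 range.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    scores = {}
    for key in keys:
        value = data.get(key)
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        if not 1 <= value <= 5:
            return None
        scores[key] = value
    return scores


reply = 'Here are the scores:\n{"correctness": 4, "conciseness": 5, "helpfulness": 3}'
print(parse_judge_scores(reply))
print(parse_judge_scores("I cannot grade this."))
```

<p>Returning <code>None</code> for bad replies lets you count and inspect judge failures instead of crashing halfway through a run.</p>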
<p>You can loop through models to automatically gather structured scores.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> statistics

<span class="hljs-keyword">for</span> model, data <span class="hljs-keyword">in</span> results.items():
    scores = [evaluate(cand, ref) <span class="hljs-keyword">for</span> cand, ref <span class="hljs-keyword">in</span> zip(data[<span class="hljs-string">"outputs"</span>], references)]
    results[model][<span class="hljs-string">"scores"</span>] = scores

    avg = {k: statistics.mean([s[k] <span class="hljs-keyword">for</span> s <span class="hljs-keyword">in</span> scores]) <span class="hljs-keyword">for</span> k <span class="hljs-keyword">in</span> scores[<span class="hljs-number">0</span>]}
    print(<span class="hljs-string">f"\n<span class="hljs-subst">{model}</span> Average Scores:"</span>)
    <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> avg.items():
        print(<span class="hljs-string">f"  <span class="hljs-subst">{k}</span>: <span class="hljs-subst">{v:<span class="hljs-number">.2</span>f}</span>"</span>)
</code></pre>
<h2 id="heading-step-4-analyze-visualize-and-interpret">Step 4: Analyze, Visualize, and Interpret</h2>
<p>Our goal here is to turn raw numbers and judge scores into actionable insight. Visualization exposes trade-offs (cost vs. quality vs. latency), highlights variance and outliers, and helps you pick the model that best fits your constraints.</p>
<p>What to visualize and why:</p>
<ul>
<li><p>Latency bars: compare average response time per model. Good for quick performance triage.</p>
</li>
<li><p>Cost bars: cost per 1,000 requests. Makes budget trade-offs visible.</p>
</li>
<li><p>Quality distributions: box plots or histograms of judge scores. Shows variance and outliers.</p>
</li>
<li><p>Quality vs cost scatter: quickly surfaces Pareto-efficient choices.</p>
</li>
<li><p>Confusion matrices: for classification tasks. Shows where models disagree with ground truth.</p>
</li>
<li><p>Radar charts: helpful when comparing 3 to 6 metrics across models at once.</p>
</li>
</ul>
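<p>Before plotting a quality vs cost scatter, you can find the Pareto-efficient models directly. In this sketch a model is dropped when another model is at least as good on both quality and cost and strictly better on one. The <code>pareto_front</code> function and the numbers are hypothetical and only illustrate the idea.</p>

```python
def pareto_front(models):
    """Return names of models that are not dominated on (quality, cost).

    A model is dominated when another model has quality at least as high
    and cost at least as low, and is strictly better on one of the two.
    """
    front = []
    for name, (quality, cost) in models.items():
        dominated = any(
            q >= quality and c <= cost and (q > quality or c < cost)
            for other, (q, c) in models.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical numbers: (mean judge score, dollars per 1,000 requests)
models = {
    "model-A": (4.5, 2.00),
    "model-B": (4.1, 0.40),
    "model-C": (3.9, 0.90),
}
print(pareto_front(models))  # ['model-A', 'model-B']
```

<p>Here <code>model-C</code> drops out because <code>model-B</code> scores higher and costs less. The remaining models represent genuine trade-offs that you have to weigh against your constraints.</p>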
<p>The code below builds a simple bar chart from a <code>results</code> dictionary: <code>models_list</code> provides x-axis labels and <code>latencies</code> maps to the bar heights in seconds. You can replicate the pattern for cost or judge-based scores by swapping the y-values.</p>
<p><strong>Requirements:</strong> <code>matplotlib</code> installed (<code>pip install matplotlib</code>).</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

latencies = [results[m][<span class="hljs-string">'latency'</span>] <span class="hljs-keyword">for</span> m <span class="hljs-keyword">in</span> results]
models_list = list(results.keys())

plt.bar(models_list, latencies)  <span class="hljs-comment"># simple bar chart; add styling if needed</span>
plt.title(<span class="hljs-string">'Model Latency Comparison'</span>)
plt.ylabel(<span class="hljs-string">'Seconds'</span>)
plt.show()
</code></pre>
<p>The chart is embedded here for reference:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762499865751/d3ebd40e-e8f4-44c8-b953-612d1604a8ca.png" alt="Bar chart comparing model latency." class="image--center mx-auto" width="1979" height="1180" loading="lazy"></p>
<p>Figure: Model latency comparison (seconds per batch).</p>
<p><strong>Reflection Question:</strong> Which metric matters most for your use case: accuracy, speed, or cost?</p>
<h2 id="heading-step-5-iterate-and-scale-up">Step 5: Iterate and Scale Up</h2>
<p>At this point, we’ll move from small experiments to a repeatable, automated evaluation pipeline that can run at scale, track regression, and integrate with monitoring and CI. This step is about operationalizing the evaluation so you can confidently detect when a model update helps or harms your product.</p>
<p>Evaluation flow (high level):</p>
<ol>
<li><p><strong>Dataset (JSONL)</strong>: a versioned test set with metadata (category, difficulty).</p>
</li>
<li><p><strong>Prompt templates</strong>: standardized prompts or templates applied uniformly across models.</p>
</li>
<li><p><strong>Model runners</strong>: parallel execution across a pool of models (cloud APIs or local hosts).</p>
</li>
<li><p><strong>Judge + Metrics</strong>: compute structured scores (judge JSON) and classical metrics (accuracy, F1).</p>
</li>
<li><p><strong>Storage &amp; dashboards</strong>: persist results, visualize trends, alert on regressions.</p>
</li>
</ol>
<p>Having this explicit flow helps you choose tooling. Below are two representative frameworks and how they map to the flow so you can see which stages they help with.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762499923039/a8fb6a38-5046-488e-8c15-3783c8d5dab9.png" alt="Circular diagram showing Data feeding Models, Models feeding Judge, Judge producing Insights, Insights leading to Refinement, and Refinement feeding back into Data." class="image--center mx-auto" width="2339" height="284" loading="lazy"></p>
<p>Figure: Evaluation pipeline – Data → Models → Judge → Insights → Refinement → Data</p>
<p>Some examples that map to the flow are:</p>
<ul>
<li><p><strong>AWS FMEval</strong>: focuses on large-scale evaluation and experiment tracking. It covers dataset adapters, parallel model runners, built-in metrics, and native integration with AWS experiment storage and dashboards. Use it when your data lives on cloud storage and you want tight Bedrock or AWS integration for production evaluation runs.</p>
</li>
<li><p><strong>LangChain Eval</strong>: focuses on tight integration with application pipelines. It covers prompt templates, judge and metric hooks, and easy programmatic evaluators that plug directly into LangChain-based model runners. Use it when your evaluation should be embedded in development pipelines or when you already use LangChain for orchestration.</p>
</li>
</ul>
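<p>The snippet below is a schematic sketch of how an FMEval run maps onto this flow. The class names and arguments are simplified, so check the <code>fmeval</code> documentation for the exact imports and signatures before running it.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fmeval <span class="hljs-keyword">import</span> DataConfig, ModelRunner, EvaluationSet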

cfg = DataConfig(dataset_uri=<span class="hljs-string">"s3://your-dataset/reviews.jsonl"</span>)  <span class="hljs-comment"># JSONL test set</span>
runner = ModelRunner(model_id=<span class="hljs-string">"model-id"</span>)       <span class="hljs-comment"># pick a model to evaluate</span>
eval_set = EvaluationSet(config=cfg, runner=runner)
<span class="hljs-comment"># Run evaluation with a simple metric; swap in your custom metric as needed</span>
eval_set.evaluate(metric=<span class="hljs-string">"accuracy"</span>)
<span class="hljs-comment"># Persist results for dashboards or regression tracking</span>
eval_set.save(<span class="hljs-string">"./results.json"</span>)
</code></pre>
<p>You’ll want to schedule evaluations and track drift regularly, for example with nightly or weekly runs on a fixed test set. Send an alert when a model update drops a score or increases latency beyond a threshold.</p>
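<p>A threshold check like the one described above can be sketched in a few lines. This example compares a stored baseline run against the current run. The function <code>find_regressions</code>, the threshold defaults, and the metric values are all hypothetical, so tune them to what counts as a meaningful change for your product.</p>

```python
def find_regressions(baseline, current, max_score_drop=0.2, max_latency_increase=0.25):
    """Compare two runs and list the metrics that regressed past a threshold.

    Score metrics regress when they fall by more than max_score_drop points.
    The "latency" metric regresses when it rises by more than
    max_latency_increase, expressed as a fraction of the baseline.
    """
    alerts = []
    for metric, old in baseline.items():
        new = current.get(metric)
        if new is None:
            alerts.append(f"{metric}: missing from current run")
        elif metric == "latency":
            # Latency regresses when it rises by more than the allowed fraction
            if new > old * (1 + max_latency_increase):
                alerts.append(f"{metric}: {old:.2f}s -> {new:.2f}s")
        elif old - new > max_score_drop:
            # Quality scores regress when they fall by more than the allowed points
            alerts.append(f"{metric}: {old:.2f} -> {new:.2f}")
    return alerts


# Hypothetical averages from two evaluation runs on the same test set
baseline = {"correctness": 4.4, "conciseness": 4.6, "latency": 1.2}
current = {"correctness": 4.0, "conciseness": 4.5, "latency": 1.7}

alerts = find_regressions(baseline, current)
for alert in alerts:
    print("REGRESSION", alert)
```

<p>In a scheduled job, a non-empty <code>alerts</code> list is the signal to fail the run or notify your team.</p>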
<h2 id="heading-preparing-a-test-dataset">Preparing a Test Dataset</h2>
<p>A well-prepared test dataset is the foundation of reliable model evaluation. Here are a few best practices, followed by a concrete example:</p>
<ul>
<li><p>Reflect real use cases: use authentic data from your domain such as customer queries, logs, or user reviews.</p>
</li>
<li><p>Diversify examples: include easy, typical, and edge-case scenarios to measure robustness.</p>
</li>
<li><p>Expert annotation: have domain experts provide clear reference outputs or ground truth labels.</p>
</li>
<li><p>Keep it separate: ensure the test dataset is not reused from training or fine-tuning.</p>
</li>
<li><p>Update regularly: add new examples to reflect changing user behavior or data drift.</p>
</li>
<li><p>Version everything: track dataset versions, annotation changes, and evaluation notes.</p>
</li>
<li><p>Quality over quantity: start small but ensure examples are accurate and representative.</p>
</li>
</ul>
<h3 id="heading-small-jsonl-test-set">Small JSONL test set</h3>
<p>Create a line-delimited JSON (JSONL) file where each line is a JSON object with two required fields: <code>input</code> (the prompt) and <code>reference</code> (the expected output). This simple, tooling-friendly format is accepted by most evaluation frameworks and is easy to version, diff, and slice.</p>
<p>Optionally add metadata fields such as <code>category</code>, <code>difficulty</code>, or <code>source</code> to enable filtered analysis and targeted slicing during evaluation.</p>
<pre><code class="lang-python">{<span class="hljs-string">"input"</span>: <span class="hljs-string">"The camera quality is great but the battery dies fast."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Good camera, poor battery."</span>}
{<span class="hljs-string">"input"</span>: <span class="hljs-string">"Love the design and performance, but it's overpriced."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Excellent design but expensive."</span>}
{<span class="hljs-string">"input"</span>: <span class="hljs-string">"Fast processor, poor sound quality, average screen."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Fast but weak audio and display."</span>}
</code></pre>
<p>Helper script to produce JSONL:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json
samples = [
    {<span class="hljs-string">"input"</span>: <span class="hljs-string">"The camera quality is great but the battery dies fast."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Good camera, poor battery."</span>},
    {<span class="hljs-string">"input"</span>: <span class="hljs-string">"Love the design and performance, but it's overpriced."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Excellent design but expensive."</span>},
    {<span class="hljs-string">"input"</span>: <span class="hljs-string">"Fast processor, poor sound quality, average screen."</span>, <span class="hljs-string">"reference"</span>: <span class="hljs-string">"Fast but weak audio and display."</span>}
]
<span class="hljs-keyword">with</span> open(<span class="hljs-string">"reviews_test.jsonl"</span>, <span class="hljs-string">"w"</span>) <span class="hljs-keyword">as</span> f:
    <span class="hljs-keyword">for</span> row <span class="hljs-keyword">in</span> samples:
        f.write(json.dumps(row) + <span class="hljs-string">"\n"</span>)
print(<span class="hljs-string">"Wrote reviews_test.jsonl"</span>)
</code></pre>
<p>You can add fields like <code>category</code> or <code>difficulty</code> to filter and slice results later.</p>
<p>Even a compact, well-designed test set can highlight major model differences and guide better deployment decisions.</p>
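<p>To show how the metadata fields pay off, the sketch below loads JSONL rows, checks the two required fields, and groups the rows by <code>category</code>. The helpers <code>load_jsonl</code> and <code>group_by</code> are hypothetical, and the category labels are invented for this example.</p>

```python
import json
from collections import defaultdict


def load_jsonl(lines):
    """Parse JSONL lines, requiring 'input' and 'reference' on every row."""
    rows = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        row = json.loads(line)
        missing = {"input", "reference"} - set(row)
        if missing:
            raise ValueError(f"line {number} is missing {sorted(missing)}")
        rows.append(row)
    return rows


def group_by(rows, field, default="uncategorized"):
    """Group rows by a metadata field so each slice can be scored separately."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.get(field, default)].append(row)
    return dict(grouped)


# Same reviews as above, with an invented 'category' field added
samples = [
    {"input": "The camera quality is great but the battery dies fast.",
     "reference": "Good camera, poor battery.", "category": "hardware"},
    {"input": "Love the design and performance, but it's overpriced.",
     "reference": "Excellent design but expensive.", "category": "pricing"},
    {"input": "Fast processor, poor sound quality, average screen.",
     "reference": "Fast but weak audio and display.", "category": "hardware"},
]
lines = [json.dumps(sample) for sample in samples]

rows = load_jsonl(lines)
slices = group_by(rows, "category")
for category, members in slices.items():
    print(category, len(members))
```

<p>To read from disk instead, pass an open file such as <code>open("reviews_test.jsonl")</code> to <code>load_jsonl</code>, since iterating a file yields one line at a time.</p>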
<h2 id="heading-cloud-providers-and-apis-for-llm-access">Cloud Providers and APIs for LLM Access</h2>
<p>Before you can benchmark different large language models, you need reliable ways to access them. Most LLMs are hosted behind APIs or cloud platforms that expose standard interfaces for sending prompts and receiving outputs. Choosing the right provider affects not only <em>which</em> models you can test, but also your results for latency, throughput, and cost.</p>
<p>Now, we’ll look at some of the main options for accessing LLMs. These range from commercial APIs like OpenAI and Anthropic, to open-source options like Hugging Face, and enterprise platforms like AWS Bedrock and Azure OpenAI.</p>
<p>Understanding these platforms will help you design realistic benchmarks that reflect the infrastructure you’ll actually deploy in production.</p>
<ul>
<li><p><strong>OpenAI and Anthropic:</strong> Reliable APIs offering strong reasoning and creative models.</p>
</li>
<li><p><strong>Google Gemini and Cohere:</strong> Strong multimodal and enterprise-friendly options.</p>
</li>
<li><p><strong>OpenRouter:</strong> Simplifies access to multiple providers with a single API key.</p>
</li>
<li><p><strong>Hugging Face:</strong> Great for open-source experimentation and deployment flexibility.</p>
</li>
<li><p><strong>AWS Bedrock and Azure OpenAI:</strong> Enterprise-grade platforms with security, compliance, and scalability.</p>
</li>
</ul>
<p>Use a unified testing approach for flexible experiments and a production cloud provider when you need compliance and scalability.</p>
<p>Once you’ve decided where to source your models, you can run consistent benchmarks across providers using a unified API interface. This helps make sure your comparisons reflect real deployment conditions.</p>
<h2 id="heading-common-pitfalls-to-avoid">Common Pitfalls to Avoid</h2>
<p>Below are five common mistakes, why they matter, and what to do instead. Keep this checklist handy when designing experiments or reviewing results.</p>
<p><strong>1. Using the same model as both generator and judge</strong><br>Shared biases inflate scores and hide errors. Instead, you can use a separate judge (different provider, family, or size) and keep the judge fixed across runs.</p>
<p><strong>2. Relying only on aggregate numbers</strong><br>Averages hide tone, factuality issues, and edge-case failures. Instead, you should maintain a curated error-analysis set and do periodic manual spot checks.</p>
<p><strong>3. Ignoring latency and cost</strong><br>A high-scoring model may be too slow or expensive for production SLAs. Instead, you can track latency distributions and projected monthly cost alongside quality metrics.</p>
<p><strong>4. Not versioning datasets or prompts</strong><br>Silent changes break comparability and reproducibility. Make sure you store datasets and prompt templates in version control and log run metadata and data hashes for every evaluation.</p>
<p><strong>5. Overfitting to the test set</strong><br>Repeated tuning on a tiny set undermines generalization. Instead, keep a holdout set, rotate or refresh samples, and expand the dataset over time.</p>
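<p>For pitfall 4, one lightweight way to log data hashes is to fingerprint the dataset and prompt template at the start of every run. This is a minimal sketch. The function names and metadata fields are hypothetical, and the dataset text is a placeholder rather than a real file.</p>

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_text(text):
    """Return a stable fingerprint for a dataset or prompt template."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def run_metadata(dataset_text, prompt_template, models, judge_model):
    """Collect the details needed to reproduce or compare an evaluation run."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "dataset_sha256": sha256_text(dataset_text),
        "prompt_sha256": sha256_text(prompt_template),
        "models": list(models),
        "judge_model": judge_model,
    }


# Placeholder dataset content; in practice, read your JSONL file's text
dataset_text = '{"input": "example review", "reference": "example summary"}\n'
meta = run_metadata(
    dataset_text,
    "Summarize this review: {review}",
    ["model-A", "model-B"],
    "judge-model",
)
print(json.dumps(meta, indent=2))
```

<p>If two runs report different hashes, their scores are not directly comparable, which is exactly the silent change this pitfall warns about.</p>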
<h2 id="heading-conclusion-turning-evaluation-into-insight">Conclusion: Turning Evaluation into Insight</h2>
<p>Benchmarking helps you score models as well as understand them. Through this workflow, you’ve seen how to:</p>
<ol>
<li><p>Define tasks and meaningful metrics.</p>
</li>
<li><p>Generate model outputs programmatically.</p>
</li>
<li><p>Evaluate using a judge model for consistency.</p>
</li>
<li><p>Visualize trade-offs to make data-driven choices.</p>
</li>
<li><p>Iterate and scale up with an automated, versioned evaluation pipeline.</p>
</li>
</ol>
<p>As models evolve, your benchmarking pipeline becomes a living system. It helps you track progress, validate improvements, and justify decisions with evidence.</p>
<p>Choosing an LLM is no longer guesswork. It’s now a structured experiment grounded in real data. Each iteration builds intuition and confidence. Over time, you’ll know not just which model performs best, but <em>why</em>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Write Benchmark Tests for Your Golang Functions ]]>
                </title>
                <description>
                    <![CDATA[ Hello Gophers 👋 Let me start by asking you a question: How would you test the performance of a piece of code or a function in Go? Well, you could use benchmark tests. In this tutorial, I will show you how to use an awesome benchmarking tool that’s b... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-write-benchmark-tests-for-your-golang-functions/</link>
                <guid isPermaLink="false">66f17d16371c67381d509aea</guid>
                
                    <category>
                        <![CDATA[ golang ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Golang developer ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Testing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Benchmark ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Pedro ]]>
                </dc:creator>
                <pubDate>Mon, 23 Sep 2024 14:37:10 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726668982641/58540086-9f98-4ac9-8c8a-84ef45e27875.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Hello Gophers 👋</p>
<p>Let me start by asking you a question: How would you test the performance of a piece of code or a function in Go? Well, you could use <strong>benchmark</strong> tests.</p>
<p>In this tutorial, I will show you how to use an awesome benchmarking tool that’s built into the Golang testing package.</p>
<p>Let’s go.</p>
<h2 id="heading-what-are-benchmark-tests">What Are Benchmark Tests?</h2>
<p>In Go, <a target="_blank" href="https://pkg.go.dev/testing#hdr-Benchmarks">benchmark tests</a> are used to measure the performance (speed and memory usage) of functions or blocks of code. These tests are part of the Go testing framework and are written in the same files as unit tests, but they are specifically for performance analysis.</p>
<h2 id="heading-example-use-case-fibonacci-sequence">Example Use Case: Fibonacci Sequence</h2>
<p>For this example, I'll be using the classic Fibonacci Sequence, which is determined by:</p>
<pre><code class="lang-plaintext">if (x &lt;= 2) 
   F(1) = 1
   F(2) = 1
else 
   F(x) = F(x-1) + F(x-2)

In practice, the sequence is:
1, 1, 2, 3, 5, 8, 13, etc.
</code></pre>
<p>This sequence is important because it appears in various parts of mathematics and nature as well, as shown below:</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/v6fqdlmiqjob46joyfpz.png" alt="Fibonacci sequence in a spiral (like a snail shell)" width="1280" height="806" loading="lazy"></p>
<p>There are several ways to implement this sequence in code, and I'll pick two of them for our benchmark testing: the recursive and iterative methods. Each function takes a <em>position</em> and returns the Fibonacci number at that position.</p>
<h3 id="heading-recursive-method">Recursive Method</h3>
<pre><code class="lang-go"><span class="hljs-comment">// main.go</span>
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">fibRecursive</span><span class="hljs-params">(n <span class="hljs-keyword">uint</span>)</span> <span class="hljs-title">uint</span></span> {
    <span class="hljs-keyword">if</span> n &lt;= <span class="hljs-number">2</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    }
    <span class="hljs-keyword">return</span> fibRecursive(n<span class="hljs-number">-1</span>) + fibRecursive(n<span class="hljs-number">-2</span>)
}
</code></pre>
<p>The function above is a recursive implementation of the Fibonacci calculation. Here is a step-by-step breakdown for readers who are new to Go.</p>
<h4 id="heading-1-function">1. <strong>Function:</strong></h4>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">fibRecursive</span><span class="hljs-params">(n <span class="hljs-keyword">uint</span>)</span> <span class="hljs-title">uint</span></span>
</code></pre>
<ul>
<li><p><code>func</code>: This keyword defines a function in Go.</p>
</li>
<li><p><code>fibRecursive</code>: This is the name of the function. It’s called <code>fibRecursive</code> because it calculates Fibonacci numbers using recursion.</p>
</li>
<li><p><code>n uint</code>: The function takes a single argument, <code>n</code>, which is of type <code>uint</code> (an unsigned integer). This represents the position of the Fibonacci sequence that we want to calculate.</p>
</li>
<li><p><code>uint</code>: The function returns a <code>uint</code> (unsigned integer) because Fibonacci numbers are non-negative integers.</p>
</li>
</ul>
<h4 id="heading-2-base-stage">2. <strong>Base Case:</strong></h4>
<pre><code class="lang-go"><span class="hljs-keyword">if</span> n &lt;= <span class="hljs-number">2</span> {
    <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
}
</code></pre>
<ul>
<li><p>The <code>if</code> statement checks if <code>n</code> is less than or equal to 2.</p>
</li>
<li><p>In the Fibonacci sequence, the 1st and 2nd numbers are both 1. So, if <code>n</code> is 1 or 2, the function returns 1. (Since <code>n</code> is unsigned, an input of 0 also takes this branch and returns 1.)</p>
</li>
<li><p>This is called the <strong>base case</strong>, and it stops the recursion from going infinitely deep.</p>
</li>
</ul>
<h4 id="heading-3-recursive-stage">3. <strong>Recursive Case:</strong></h4>
<pre><code class="lang-go"><span class="hljs-keyword">return</span> fibRecursive(n<span class="hljs-number">-1</span>) + fibRecursive(n<span class="hljs-number">-2</span>)
</code></pre>
<ul>
<li><p>If <code>n</code> is greater than 2, the function calls itself twice:</p>
<ul>
<li><p><code>fibRecursive(n-1)</code>: This will calculate the Fibonacci number for the position just before <code>n</code>.</p>
</li>
<li><p><code>fibRecursive(n-2)</code>: This will calculate the Fibonacci number for two positions before <code>n</code>.</p>
</li>
</ul>
</li>
<li><p>The function then adds these two results together, because every Fibonacci number is the sum of the two preceding numbers.</p>
</li>
</ul>
<p>For more theory on recursion, check out these <a target="_blank" href="https://www.freecodecamp.org/news/tag/recursion/">articles</a>.</p>
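<p>To make the cost of the recursive case more concrete, here is a minimal sketch that counts how many function calls a single invocation triggers. <code>countCalls</code> is a hypothetical helper introduced only for illustration (it is not part of the original code); it mirrors the structure of <code>fibRecursive</code> but adds up calls instead of Fibonacci numbers.</p>

```go
package main

// countCalls is a hypothetical helper (not part of the original article code).
// It follows the same branching as fibRecursive, but returns the number of
// function calls needed to compute position n rather than the value itself.
func countCalls(n uint) uint {
	if n <= 2 {
		// Base case: this call returns immediately, so it counts as one call.
		return 1
	}
	// One call for this invocation, plus every call made by both branches.
	return 1 + countCalls(n-1) + countCalls(n-2)
}
```

<p>Under this sketch, position 10 needs 109 calls and position 40 needs 204,668,309 calls, because the same positions are recomputed many times in different branches.</p>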
<h3 id="heading-iterative-method">Iterative Method</h3>
<pre><code class="lang-go"><span class="hljs-comment">// main.go</span>

<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">fibIterative</span><span class="hljs-params">(position <span class="hljs-keyword">uint</span>)</span> <span class="hljs-title">uint</span></span> {
    <span class="hljs-comment">// Return early, before touching the slice: for position 0 or 1,</span>
    <span class="hljs-comment">// slc[1] would be out of range and the function would panic.</span>
    <span class="hljs-keyword">if</span> position &lt;= <span class="hljs-number">2</span> {
        <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
    }

    slc := <span class="hljs-built_in">make</span>([]<span class="hljs-keyword">uint</span>, position)
    slc[<span class="hljs-number">0</span>] = <span class="hljs-number">1</span>
    slc[<span class="hljs-number">1</span>] = <span class="hljs-number">1</span>

    <span class="hljs-keyword">var</span> result, i <span class="hljs-keyword">uint</span>
    <span class="hljs-keyword">for</span> i = <span class="hljs-number">2</span>; i &lt; position; i++ {
        result = slc[i<span class="hljs-number">-1</span>] + slc[i<span class="hljs-number">-2</span>]
        slc[i] = result
    }

    <span class="hljs-keyword">return</span> result
}
</code></pre>
<p>This code implements an <strong>iterative</strong> approach to calculate the Fibonacci sequence in Go, which is different from the <strong>recursive</strong> approach. Here’s a breakdown of how it works:</p>
<h4 id="heading-1-function-1">1. <strong>Function:</strong></h4>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">fibIterative</span><span class="hljs-params">(position <span class="hljs-keyword">uint</span>)</span> <span class="hljs-title">uint</span></span>
</code></pre>
<ul>
<li><p><code>func</code>: This keyword declares a function in Go.</p>
</li>
<li><p><code>fibIterative</code>: The name of the function suggests that it calculates Fibonacci numbers using iteration (a loop).</p>
</li>
<li><p><code>position uint</code>: The function takes one argument, <code>position</code>, which is an unsigned integer (<code>uint</code>). This represents the position of the Fibonacci sequence you want to calculate.</p>
</li>
<li><p><code>uint</code>: The function returns an unsigned integer (<code>uint</code>), which will be the Fibonacci number at the specified position.</p>
</li>
</ul>
<h4 id="heading-2-creating-a-slice-array-like-structure">2. <strong>Creating a Slice (Array-like structure):</strong></h4>
<pre><code class="lang-go">slc := <span class="hljs-built_in">make</span>([]<span class="hljs-keyword">uint</span>, position)
</code></pre>
<ul>
<li><code>slc</code> is a slice (a dynamic array in Go) that is created with the length of <code>position</code>. This slice will store Fibonacci numbers at each index.</li>
</ul>
<h4 id="heading-3-initial-values-for-fibonacci-sequence">3. <strong>Initial Values for Fibonacci Sequence:</strong></h4>
<pre><code class="lang-go">slc[<span class="hljs-number">0</span>] = <span class="hljs-number">1</span>
slc[<span class="hljs-number">1</span>] = <span class="hljs-number">1</span>
</code></pre>
<ul>
<li>The first two Fibonacci numbers are both <code>1</code>, so the first two positions in the slice (<code>slc[0]</code> and <code>slc[1]</code>) are set to <code>1</code>.</li>
</ul>
<h4 id="heading-4-early-return-for-small-positions">4. <strong>Early Return for Small Positions:</strong></h4>
<pre><code class="lang-go"><span class="hljs-keyword">if</span> position &lt;= <span class="hljs-number">2</span> {
    <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>
}
</code></pre>
<ul>
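<li>If the input <code>position</code> is <code>1</code> or <code>2</code>, the function directly returns <code>1</code>, because the first two Fibonacci numbers are always <code>1</code>. This check has to come before any write to <code>slc[1]</code>: a slice created with length 0 or 1 has no index 1, and indexing it would panic.</li>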
</ul>
<h4 id="heading-5-iterative-loop">5. <strong>Iterative Loop:</strong></h4>
<pre><code class="lang-go"><span class="hljs-keyword">var</span> result, i <span class="hljs-keyword">uint</span>
<span class="hljs-keyword">for</span> i = <span class="hljs-number">2</span>; i &lt; position; i++ {
    result = slc[i<span class="hljs-number">-1</span>] + slc[i<span class="hljs-number">-2</span>]
    slc[i] = result
}
</code></pre>
<ul>
<li><p>The loop starts from <code>i = 2</code> and runs until it reaches the <code>position</code>.</p>
</li>
<li><p>In each iteration, the Fibonacci number at index <code>i</code> is calculated as the sum of the two previous Fibonacci numbers (<code>slc[i-1]</code> and <code>slc[i-2]</code>).</p>
</li>
<li><p>The result is stored both in <code>result</code> and in the slice <code>slc[i]</code> for future calculations.</p>
</li>
</ul>
<h4 id="heading-6-returning-the-result">6. <strong>Returning the Result:</strong></h4>
<pre><code class="lang-go"><span class="hljs-keyword">return</span> result
</code></pre>
<ul>
<li>Once the loop finishes, the variable <code>result</code> holds the Fibonacci number at the desired position, and the function returns it.</li>
</ul>
<p>This is a more <em>efficient</em> approach to calculating Fibonacci numbers than recursion, especially when <code>position</code> is large, because <strong>it doesn’t repeat unnecessary calculations</strong>. Let’s verify that with benchmark tests.</p>
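<p>Before comparing speed, it can be worth confirming that both implementations return the same numbers. The sketch below is one way to do that, under a few assumptions: <code>sameResults</code> is a hypothetical helper that is not part of the original code, and <code>fibIterative</code> is written with the early return placed before the slice is indexed so that small positions cannot panic.</p>

```go
package main

// fibRecursive computes the Fibonacci number at position n recursively.
func fibRecursive(n uint) uint {
	if n <= 2 {
		return 1
	}
	return fibRecursive(n-1) + fibRecursive(n-2)
}

// fibIterative computes the same value with a loop and a slice.
// The early return comes first so the slice is never indexed out of range.
func fibIterative(position uint) uint {
	if position <= 2 {
		return 1
	}
	slc := make([]uint, position)
	slc[0], slc[1] = 1, 1
	for i := uint(2); i < position; i++ {
		slc[i] = slc[i-1] + slc[i-2]
	}
	return slc[position-1]
}

// sameResults is a hypothetical helper: it reports whether both
// implementations agree for every position from 1 up to max.
func sameResults(max uint) bool {
	for p := uint(1); p <= max; p++ {
		if fibRecursive(p) != fibIterative(p) {
			return false
		}
	}
	return true
}
```

<p>A call such as <code>sameResults(25)</code> stays fast even with the recursive version, and a <code>false</code> result would mean the timings compare functions that compute different things.</p>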
<h2 id="heading-how-to-run-the-benchmark-tests">How to Run the Benchmark Tests</h2>
<p>Now let’s write the benchmark tests. First, create a <strong>main_test.go</strong> file (Go only picks up test and benchmark functions from files whose names end in <code>_test.go</code>). Then, following Golang's <a target="_blank" href="https://pkg.go.dev/testing@go1.22.3#hdr-Benchmarks">documentation</a> on benchmark tests, create the benchmark functions as follows:</p>
<pre><code class="lang-go"><span class="hljs-comment">// main_test.go</span>
<span class="hljs-keyword">package</span> main <span class="hljs-comment">// assumed: must match the package declared in main.go</span>

<span class="hljs-keyword">import</span> <span class="hljs-string">"testing"</span>

<span class="hljs-comment">// Benchmark for Iterative Function</span>
<span class="hljs-comment">// b.N is chosen by the testing framework, not by us.</span>
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">BenchmarkFibIterative</span><span class="hljs-params">(b *testing.B)</span></span> {
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">0</span>; i &lt; b.N; i++ { 
        fibIterative(<span class="hljs-keyword">uint</span>(<span class="hljs-number">10</span>))
    }
}
<span class="hljs-comment">// Benchmark for Recursive Function</span>
<span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">BenchmarkFibRecursive</span><span class="hljs-params">(b *testing.B)</span></span> {
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">0</span>; i &lt; b.N; i++ {
        fibRecursive(<span class="hljs-keyword">uint</span>(<span class="hljs-number">10</span>))
    }
}
</code></pre>
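<p>Let's run the benchmarks for position 10 first and then increase the position. To run a benchmark, use the command <code>go test -bench=NameoftheFunction</code>. The value is a regular expression matched against benchmark names, so <code>-bench=.</code> runs all of them.</p>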
<p>If you want to know more about this command, check <a target="_blank" href="https://pkg.go.dev/testing@go1.22.3#Benchmark">here</a>. Let’s check the function for <strong>position 10</strong>:</p>
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">BenchmarkFibIterative</span><span class="hljs-params">(b *testing.B)</span></span> {
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">0</span>; i &lt; b.N; i++ { 
        fibIterative(<span class="hljs-keyword">uint</span>(<span class="hljs-number">10</span>))
    }
}
</code></pre>
<pre><code class="lang-go"><span class="hljs-keyword">go</span> test -bench=BenchmarkFibIterative
Results:
cpu: Intel(R) Core(TM) i7<span class="hljs-number">-7700</span>HQ CPU @ <span class="hljs-number">2.80</span>GHz
BenchmarkFibIterative<span class="hljs-number">-8</span>         <span class="hljs-number">27715262</span>                <span class="hljs-number">42.86</span> ns/op
PASS
ok      playground      <span class="hljs-number">2.617</span>s
</code></pre>
<p>Let’s analyze with the help of this image:</p>
<p><img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/484ap11qw8d81b43gg0v.png" alt="visit https://www.practical-go-lessons.com/chap-34-benchmarks" width="967" height="277" loading="lazy"></p>
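<p>Reading the output with the help of the image: the <code>-8</code> suffix is the value of <code>GOMAXPROCS</code> (8 logical CPUs were available to the run), <strong>27_715_262</strong> is the number of iterations (<code>b.N</code>) the framework settled on, and <strong>42.86 ns/op</strong> is the average time of a single call. The iteration count is not fixed in advance: by default, Go keeps increasing <code>b.N</code> until the loop runs for roughly one second, which matches 27_715_262 × 42.86 ns ≈ 1.19 s. The final <code>ok</code> line shows the total time of the whole run, which also includes the earlier, shorter rounds used to find <code>b.N</code>.</p>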
<pre><code class="lang-go"><span class="hljs-function"><span class="hljs-keyword">func</span> <span class="hljs-title">BenchmarkFibRecursive</span><span class="hljs-params">(b *testing.B)</span></span> {
    <span class="hljs-keyword">for</span> i := <span class="hljs-number">0</span>; i &lt; b.N; i++ {
        fibRecursive(<span class="hljs-keyword">uint</span>(<span class="hljs-number">10</span>))
    }
}
</code></pre>
<pre><code class="lang-go"><span class="hljs-keyword">go</span> test -bench=BenchmarkFibRecursive
Results:
cpu: Intel(R) Core(TM) i7<span class="hljs-number">-7700</span>HQ CPU @ <span class="hljs-number">2.80</span>GHz
BenchmarkFibRecursive<span class="hljs-number">-8</span>          <span class="hljs-number">6644950</span>               <span class="hljs-number">174.3</span> ns/op
PASS
ok      playground      <span class="hljs-number">1.819</span>s
</code></pre>
<p>Reading this result the same way, the recursive benchmark ran <strong>6_644_950 iterations</strong> at <strong>174.3 ns/op</strong>, and the whole run took <strong>1.819 seconds</strong>. Side by side, we have:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Fibonacci’s Function</td><td>Position</td><td>Iterations</td><td>Time to run (s)</td></tr>
</thead>
<tbody>
<tr>
<td>Iterative</td><td>10</td><td>27_715_262</td><td>1.651</td></tr>
<tr>
<td>Recursive</td><td>10</td><td>6_644_950</td><td>1.819</td></tr>
</tbody>
</table>
</div><p>The <strong>benchmark results</strong> show that the iterative approach is significantly more efficient than the recursive approach for calculating the Fibonacci sequence.</p>
<p>For position 10, the iterative function completed approximately <strong>27.7 million iterations</strong> at <strong>42.86 ns per call</strong>, while the recursive function managed only <strong>6.6 million iterations</strong> at <strong>174.3 ns per call</strong>. Because each benchmark runs for roughly the same amount of time, the time per operation is the number to compare: the iterative method is about four times faster per call.</p>
<p>To prove this even further, let’s try <strong>position 40</strong> (4 times the previous value):</p>
<pre><code class="lang-go"><span class="hljs-comment">// Results for the Iterative Function</span>
cpu: Intel(R) Core(TM) i7<span class="hljs-number">-7700</span>HQ CPU @ <span class="hljs-number">2.80</span>GHz
BenchmarkFibIterative<span class="hljs-number">-8</span>          <span class="hljs-number">9904401</span>               <span class="hljs-number">114.5</span> ns/op
PASS
ok      playground      <span class="hljs-number">1.741</span>s

<span class="hljs-comment">// Results for the Recursive Function</span>
cpu: Intel(R) Core(TM) i7<span class="hljs-number">-7700</span>HQ CPU @ <span class="hljs-number">2.80</span>GHz
BenchmarkFibRecursive<span class="hljs-number">-8</span>                <span class="hljs-number">4</span>         <span class="hljs-number">324133575</span> ns/op
PASS
ok      playground      <span class="hljs-number">3.782</span>s
</code></pre>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Fibonacci’s Function</td><td>Position</td><td>Iterations</td><td>Time to run (s)</td></tr>
</thead>
<tbody>
<tr>
<td>Iterative</td><td>40</td><td>9_904_401</td><td>1.741</td></tr>
<tr>
<td>Recursive</td><td>40</td><td>4</td><td>3.782</td></tr>
</tbody>
</table>
</div><p>Once again, the benchmark results clearly highlight the efficiency difference between the iterative and recursive approaches to calculating Fibonacci numbers.</p>
<p>The <strong>iterative function</strong> completed approximately <strong>9.9 million iterations</strong> with an average execution time of <strong>114.5 nanoseconds per operation</strong>, finishing the benchmark in <strong>1.741 seconds</strong>. In stark contrast, the <strong>recursive function</strong> only completed <strong>4 iterations</strong> with an average execution time of <strong>324,133,575 nanoseconds per operation</strong> (over 324 milliseconds per call), taking <strong>3.782 seconds</strong> to finish.</p>
<p>These results show that the recursive approach is far less efficient because it recomputes the same positions over and over: its number of calls grows exponentially with the position, while the iterative loop grows linearly. That gap widens quickly as the input size increases.</p>
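<p>Instead of editing the position by hand between runs, several positions can be measured in one go with sub-benchmarks. The following is a minimal sketch, not the author's setup: the name <code>BenchmarkFibIterativePositions</code>, the <code>sink</code> variable, and the chosen positions are assumptions made for illustration, and <code>fibIterative</code> is repeated (with the early return first) so the block is self-contained.</p>

```go
package main

import (
	"fmt"
	"testing"
)

// fibIterative is repeated here so the sketch compiles on its own.
func fibIterative(position uint) uint {
	if position <= 2 {
		return 1
	}
	slc := make([]uint, position)
	slc[0], slc[1] = 1, 1
	for i := uint(2); i < position; i++ {
		slc[i] = slc[i-1] + slc[i-2]
	}
	return slc[position-1]
}

// sink (an assumption of this sketch) stores the result in a package-level
// variable so the compiler cannot discard the call as unused work.
var sink uint

// BenchmarkFibIterativePositions runs one named sub-benchmark per position.
func BenchmarkFibIterativePositions(b *testing.B) {
	for _, pos := range []uint{10, 20, 40} {
		b.Run(fmt.Sprintf("position=%d", pos), func(b *testing.B) {
			// Also report memory allocations for each sub-benchmark.
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = fibIterative(pos)
			}
		})
	}
}
```

<p>Placed in a <code>_test.go</code> file, this could be run with <code>go test -bench=FibIterativePositions</code>. Each position then appears on its own line of the output, and <code>b.ReportAllocs()</code> adds <code>B/op</code> and <code>allocs/op</code> columns next to <code>ns/op</code>.</p>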
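<p>Just out of curiosity, I tried <strong>position 60</strong>. The iterative benchmark finished as usual, but the recursive one never completed, and the run ended with a <code>SIGQUIT</code> stack dump instead of a result:</p>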
<pre><code class="lang-go"><span class="hljs-comment">// Results for the Iterative Function</span>
cpu: Intel(R) Core(TM) i7<span class="hljs-number">-7700</span>HQ CPU @ <span class="hljs-number">2.80</span>GHz
BenchmarkFibIterative<span class="hljs-number">-8</span>          <span class="hljs-number">7100899</span>               <span class="hljs-number">160.9</span> ns/op

<span class="hljs-comment">// Results for the Recursive Function</span>
SIGQUIT: quit
PC=<span class="hljs-number">0x7ff81935f08e</span> m=<span class="hljs-number">0</span> sigcode=<span class="hljs-number">0</span>

goroutine <span class="hljs-number">0</span> gp=<span class="hljs-number">0x3bf1800</span> m=<span class="hljs-number">0</span> mp=<span class="hljs-number">0x3bf26a0</span> [idle]:
runtime.pthread_cond_wait(<span class="hljs-number">0x3bf2be0</span>, <span class="hljs-number">0x3bf2ba0</span>)
...
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>If your production code is running slowly or its performance is unpredictable, you can use this technique, combined with <a target="_blank" href="https://pkg.go.dev/runtime/pprof"><strong>pprof</strong></a> or other tools from Go's standard library, to identify where your code is performing poorly and to test your optimizations.</p>
<p>Remember: Code that is beautiful to the eyes is not necessarily more performant.</p>
<h3 id="heading-reference">Reference</h3>
<ul>
<li><p>Recursive &amp; Iterative functions for the Fibonacci sequence <a target="_blank" href="https://gist.github.com/pedrobertao/a31466b3287f165f22d05f0fb2b066f2">here</a>.</p>
</li>
<li><p>Benchmark testing <a target="_blank" href="https://gist.github.com/pedrobertao/d435d9f1b0915cbc1cb54bc385f45104">here</a>.</p>
</li>
</ul>
<h3 id="heading-homework">Homework</h3>
<p>This <a target="_blank" href="https://www.meccanismocomplesso.org/en/the-fibonacci-series-three-different-algorithms-compared/">article</a> explains why for some small numbers, the recursive strategy is better. Can you find a better way to improve the recursive function? (Tip: use Dynamic Programming).</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
