<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ budget - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ budget - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Fri, 22 May 2026 17:40:26 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/budget/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Market Your Mobile App On a Budget ]]>
                </title>
                <description>
                    <![CDATA[ By Andrej Kovacevic Building an app is a difficult task in itself. From conceptualization, to market research, design, and development, it takes a whole lot of resources. But if you are making something for the public, you know building it is only ha... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-market-your-mobile-app-on-a-budget/</link>
                <guid isPermaLink="false">66d45da19f2bec37e2da0602</guid>
                
                    <category>
                        <![CDATA[ budget ]]>
                    </category>
                
                    <category>
                        <![CDATA[ marketing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ mobile app development ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Fri, 10 Dec 2021 21:36:23 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2021/10/photo-1514575110897-1253ff7b2ccb.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Andrej Kovacevic</p>
<p>Building an app is a difficult task in itself. From conceptualization, to market research, design, and development, it takes a whole lot of resources.</p>
<p>But if you are making something for the public, you know building it is only half of the work, right? If you want your app to reach as many people as possible, you will have to take another step: marketing.</p>
<p>Marketing is another beast altogether. And, it needs as much commitment as building the app. It's something you need to tackle even in the earliest stages of app development.</p>
<p>But the problem that plagues most solo developers is, "Where do I get the money?"</p>
<p>Traditional marketing does require quite a bit of cash. Advertising is one of the most lucrative industries, after all. Still, there must be something the rest of us cash-strapped individuals can do, right?</p>
<p>Of course there is. If you are on a limited budget, here are a few tips to get you started with marketing your masterpiece of an app.</p>
<h2 id="heading-be-active-on-social-media"><strong>Be Active on Social Media</strong></h2>
<p>No matter what your app is about, most of your audience will be on social media. And leveraging social media is one of the most cost-effective ways to promote your app.</p>
<p><a target="_blank" href="https://www.pewresearch.org/internet/2021/04/07/social-media-use-in-2021/">Seven out of ten</a> American adults use some form of social media. Among them, the most popular platforms are YouTube (81%) and Facebook (71%).</p>
<p>You can sign up to all the basic social media platforms free of charge and craft your campaigns from there. </p>
<p>Social media makes it easier for potential users to find your app and see what it’s about. These platforms are an avenue for people to interact, giving you a direct line to your audience.</p>
<p>It is also a great avenue to answer questions and interact with users. When done well, social media marketing can build up your reputation. It can also generate word-of-mouth marketing.</p>
<p>Creative social media campaigns can generate a lot of buzz around your app. A lot of brands and apps have surged in popularity due to well-executed campaigns.</p>
<p><img src="https://lh3.googleusercontent.com/tny6hjvIf3v4mmJiEfLOTAlmSqwT1uKkOSvwHP8eBycUDMfsemf6W710ZalSB5YnnTSZrV1q6FeHsTDJ_XVNmSP9u-Wqb2FaT9quEpEpyGTtl6zjWdMwxVp5tbnxkmMnDLFiIlnv" alt="Image" width="600" height="400" loading="lazy"></p>
<p>To launch a successful campaign, you will need some good market research. It's a good thing that most social media platforms will give you access to your account analytics.</p>
<p>It's good practice to keep an eye on them, as your analytics provide a wealth of information. They show which posts gain the most attention and your audience demographics, among other things.</p>
<p>All this for an initial payment of $0. If you have the budget, you can transition to paid ads to target your desired audience and get more buzz.</p>
<h2 id="heading-create-a-website-or-blog"><strong>Create a Website or Blog</strong></h2>
<p><a target="_blank" href="https://www.sweor.com/firstimpressions">Building a website</a> is one of the most established ways to promote online. It gives users a one-stop-shop to get acquainted with your app and its many facets. A self-titled domain also gives your app more credibility.</p>
<p>You can add different pages describing your app and its features. Maintaining a blog is also an excellent way to provide updates and boost your search engine rankings.</p>
<p>But as most of us know, building and hosting a website can cost some money. If you're on a tight budget, don't worry, you still have some choices.</p>
<p>Many websites offer free website building and hosting, which can fit a variety of needs. Some popular examples are WordPress, Wix, and Weebly. Building a website on WordPress is as easy as 1, 2, 3, as evidenced by the screenshots below.</p>
<p><img src="https://lh5.googleusercontent.com/e0a1CC8N4EZpbYLYyBJiaiXnMhxErkNAD4POOhTPeGuvP2OJxOa3n9tw8dpLfsa9m0J8L5FASBb6Pl2iXvA_Xqr2s5UfceqPLK0stsBjetZ_eDiY28JY-woguh_ylz-N5l75t5kV" alt="Image" width="600" height="400" loading="lazy"></p>
<p><img src="https://lh5.googleusercontent.com/d2Rsl1wKLT0kZOziuvEqoubKW4PSLaN3877uc7cju0aLmz6kr05xdhJsuvh_TA56BuBzvRcA2xA9cmWyvQfx3uHrXxB79OYQ69HQbEpAzYDx-_66NN6Q1L-pcZw6gIDOEb2sNhdX" alt="Image" width="600" height="400" loading="lazy"></p>
<p><img src="https://lh4.googleusercontent.com/tzv6uShnJzF9l9kqYeiSVAMaePi-sjXrmGddX5tvAUoQGqVT2JJve6UdTcgZR8GOPOyd2pnsxHA4wpmawLKRzwdUGnHoRlAWLbkhRvXxkznJZhM-fmNoQGEebml64kilw9R8tyJI" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Their free services will add their brand to your site domain. So your web address will be appname.wordpress.com, for example. If you want your own <a target="_blank" href="https://www.wpbeginner.com/beginners-guide/beginners-guide-what-is-a-domain-name-and-how-do-domains-work/">domain name</a>, you will need to pay a monthly fee.</p>
<p>Still, you can check out their website builders, which are pretty user-friendly. You can use their drag-and-drop features for a more convenient experience. But if you want, you can also access the code for more control.</p>
<p><img src="https://lh4.googleusercontent.com/bbulVRom6bXJLTWEeoC4eHEG9EKTbzWQmrWuTYfhGQj3cbY9Z8OtofY75iAxaoQQbX0QekVbou3L75sdpN2OU0mlKgs5EzYNT3SA1q_35oksqxNt8O8vquNDWhtv5ejn5d8su31h=s1600" alt="Image" width="600" height="400" loading="lazy">
<em>Image source: <a target="_blank" href="https://unsplash.com/photos/eveI7MOcSmw">unsplash.com</a></em></p>
<h2 id="heading-reach-out-to-influencers">Reach Out to Influencers</h2>
<p>With the popularity of social media, influencer marketing has seen a steady rise.<br>What are influencers, anyway? Influencers are social media users with a significant following. Due to their established audience, they become an attractive avenue for advertising.</p>
<p>A majority of marketing companies have a standalone content marketing budget. <a target="_blank" href="https://influencermarketinghub.com/influencer-marketing-benchmark-report-2021/">Seventy-five percent of them</a> intend to have a dedicated budget for influencer marketing.</p>
<p>Most influencers have followers that trust and admire their lifestyle and choices. Partnering with an influencer is an excellent way to add social proof. It can also build your app's reputation.</p>
<p>There is a whole lot more to influencer marketing. You will need to put some thought into selecting who you want to promote your app. </p>
<p>Selecting influencers you're going to work with requires you to know your audience. Look for someone within the same demographic or someone who has the same interests.</p>
<p>More popular influencers often charge a more significant fee. So if you're on a budget, you might want to look at ones with a smaller yet still considerable following.</p>
<p><img src="https://lh4.googleusercontent.com/nVaKn5gk0j9pkKkgiotD4MJBVOtdtMRWNpZQJdIGwN0Ji4ebzoaVFgyPDBoDv9j2z-ZrWZ9RYUE1lKKU84yg2bW1fSDf27-rZuHxLbp9rDdYsLpA0BrfdoUADmk-vK14psmmyQHI" alt="Image" width="600" height="400" loading="lazy"></p>
<p>Working with smaller influencers or micro-influencers is not a bad thing. Micro-influencers are a popular choice among brands today.</p>
<p>Prominent influencers are now more like celebrities. Micro-influencers have fewer followers but often come across as more genuine. They can make organic content with a more robust audience connection.</p>
<p>People rely a lot on word of mouth when making purchases. They are more likely to follow the advice of someone they trust. Micro-influencers have this kind of relationship with their audience. Working with them can add a boost to your app.</p>
<h2 id="heading-get-your-app-featured"><strong>Get Your App Featured</strong></h2>
<p>When it comes to promotion, nothing beats a feature <a target="_blank" href="https://www.reputio.com/become-forbes-contributor/">from a trusted media outlet</a>. It's one of the best ways to reach even more casual internet users.</p>
<p>This is where you and your team's PR skills come in. You will need to write <a target="_blank" href="https://www.forbes.com/sites/cherylsnappconner/2013/10/13/how-to-pitch-the-press-the-8-no-fail-strategies/">pitches</a> that you can send to websites where you want your app featured.</p>
<p>To get higher chances of approval, do some research about these websites first. What kind of content do they usually produce? What writing styles do they use? Tailor your pitch around these factors.</p>
<p>You can also reach out to official <a target="_blank" href="https://www.the-next-tech.com/mobile-apps/best-app-review-websites-for-marketing-and-promotions-in-2020/">review sites</a> and ask them to do a feature on your app. If you've already released your app, you could provide them with a download link. If it's still under development, you could offer demo videos and high-quality screenshots.</p>
<p>Some other ways to get featured are through podcasts, vlogs, or other media. Reaching out can be pretty time-consuming, so get your networking skills ready!</p>
<h2 id="heading-work-on-your-app-store-optimization"><strong>Work on Your App Store Optimization</strong></h2>
<p>On Google Play alone, there are already <a target="_blank" href="https://www.statista.com/statistics/289418/number-of-available-apps-in-the-google-play-store-quarter/">3.48 million apps</a>. How do you stand out from an ocean of competition?</p>
<p><a target="_blank" href="https://appradar.com/academy/what-is-app-store-optimization-aso">App Store Optimization</a> (ASO) is one of the things to consider when promoting your app. It is like Search Engine Optimization (<a target="_blank" href="https://searchengineland.com/guide/what-is-seo">SEO</a>), but for app stores.</p>
<p>Your marketing campaigns may be top-notch. Still, without ASO, you're missing out on most of your audience. A significant 70% of users find apps via app store search. Furthermore, 65% of downloads happen right after a search.</p>
<p>What does this mean? Pay attention to your ASO. If you can, create a campaign around it, but here are some key factors you should pay attention to.</p>
<ul>
<li><strong>App Name.</strong> It needs to be catchy and relevant to your app's features.</li>
<li><strong>App description.</strong> Make sure to incorporate relevant keywords to help your app rank higher.</li>
<li><strong>Icon.</strong> Create an eye-catching icon that will attract users.</li>
<li><strong>Screenshots and videos.</strong> Prepare short videos and attractive screenshots that show your app at its best.</li>
<li><strong>Ratings and reviews.</strong> User feedback matters. Many users look at reviews before downloading, so keep up your customer service!</li>
</ul>
<p>One last thing for ASO: Do your research and look at the top apps on Google Play right now. How did they structure their app page to attract and sustain the most users?</p>
<p>You can use a nifty tracker from the similarweb Blog to look at the <a target="_blank" href="https://www.similarweb.com/apps/top/google/app-index/us/all/top-free/">most downloaded apps</a> on Google Play. Data is an excellent way to study the factors that affect an app's performance.</p>
<h2 id="heading-final-words"><strong>Final Words</strong></h2>
<p>We can never deny that funding makes a lot of things easier. But if you lack in this department, creativity can go a long way. </p>
<p>There are tons of ways to promote your app. Some of them cost money, but you can make up for it with some skill and hard work. Make do with what you have, and you can gain success from it. </p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Project budgeting: an anti-pattern ]]>
                </title>
                <description>
                    <![CDATA[ By Bertil Muth The desired benefits of agile development are many. Customers are happier and more willing to buy. Lead time from idea to delivery is shorter. Employees are more motivated and more productive.  Sounds good, doesn't it? In practice, cer... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/project-budgeting-an-anti-pattern/</link>
                <guid isPermaLink="false">66d45de8bc9760a197a1035f</guid>
                
                    <category>
                        <![CDATA[ agile ]]>
                    </category>
                
                    <category>
                        <![CDATA[ agile development ]]>
                    </category>
                
                    <category>
                        <![CDATA[ budget ]]>
                    </category>
                
                    <category>
                        <![CDATA[ project management ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Scrum ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Sun, 08 Sep 2019 07:58:16 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2019/09/1-qF4e5LHYdx78lBJUP7TJsw.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Bertil Muth</p>
<p>The desired benefits of agile development are many. Customers are happier and more willing to buy. Lead time from idea to delivery is shorter. Employees are more motivated and more productive. </p>
<p>Sounds good, doesn't it? In practice, certain behaviors stand in the way of the benefits. As a scrum master and agile coach, I see the same anti-patterns in many companies over and over again. Today's anti-pattern is about project budgeting and contracts.</p>
<h2 id="heading-fixing-the-scope-not-a-good-idea">Fixing the scope - not a good idea</h2>
<p>In many companies, the scope of delivery must be decided on first. Then someone estimates the effort and calculates the necessary budget. Finally, there is a go or no-go decision for the project.</p>
<p>It is similar for many customer-supplier relationships. First, a contract must be set up that describes exactly what will be delivered. Otherwise there is no order.</p>
<p>How else could it be? How else can you, as a customer, be sure what will come out in the end?</p>
<p>As an agilist, here's my answer. Customers can't possibly know the exact end result in advance. Software development is full of surprises. The requirements of today are history tomorrow.</p>
<p>It's important to recognize the tension between predictability and changeability. You can't have both: perfect predictability of the end result at the beginning <em>and</em> maximum consideration of change requests later.</p>
<h2 id="heading-what-you-can-do-instead">What you can do instead</h2>
<p>At the beginning of agile development, it is sufficient for stakeholders to agree on a product vision, and maybe a rough roadmap. Funding can be done incrementally, sprint by sprint.</p>
<p>But what if this is not possible in your company? What if the stakeholders don't want to give up the illusion of safety that comes from clarifying requirements early?</p>
<p>Well, it's a waste of time to specify details of requirements that will change later. So what to do?</p>
<p>You can agree on user stories in advance. That makes sense. But don't define the acceptance criteria! Clarification of details is postponed to development, shortly before implementation.</p>
<p>The advantage: you can react to lessons learned from development and new customer needs.</p>
<h2 id="heading-change-management-done-right">Change management done right</h2>
<p>Either at the beginning of a project or during contract negotiation, stakeholders should agree on the mode of cooperation and the handling of changes later.</p>
<p>It is important to follow a few principles.</p>
<p>If there are changes, a change management process that evaluates each change individually is NOT enough. Instead, a change should be related to other changes.</p>
<p>For each change that is implemented, either</p>
<ul>
<li>take another requirement out of scope, or</li>
<li>increase the duration of the project.</li>
</ul>
<p>During development, individual elements can therefore be replaced by elements that are equivalent in terms of development effort. Scrum supports this with the Product Backlog: when you pull a backlog item up, the other elements automatically have less urgency.</p>
<p>If you want to know more about agile contract design, I recommend you have a look at <a target="_blank" href="https://www.scruminc.com/agile-contracts-money-for-nothing-and/">Money for Nothing and Your Change for Free</a>.</p>
<p><em>To <a target="_blank" href="https://skl.sh/2Cq497P">get the basics of agile software development right</a>, visit my online course. If you want to keep up with what I'm doing or drop me a note, follow me on <a target="_blank" href="https://dev.to/bertilmuth">dev.to</a>, <a target="_blank" href="https://www.linkedin.com/in/bertilmuth/">LinkedIn</a> or <a target="_blank" href="https://twitter.com/BertilMuth">Twitter</a>. Or visit my <a target="_blank" href="https://github.com/bertilmuth/requirementsascode">GitHub project</a>.</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How I planned my meals with Reinforcement Learning on a budget ]]>
                </title>
                <description>
                    <![CDATA[ By Sterling Osborne, PhD Researcher Following my recent article on applying Reinforcement Learning to real life problems, I decided to demonstrate this with a small example. The aim is to create an algorithm that can find a suitable choice of food pr... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-i-planned-my-meals-with-reinforcement-learning-on-a-budget-a82aac906ada/</link>
                <guid isPermaLink="false">66c34e03f41767c3c96bacca</guid>
                
                    <category>
                        <![CDATA[ budget ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Data Science ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Machine Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Reinforcement Learning ]]>
                    </category>
                
                    <category>
                        <![CDATA[ technology ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Tue, 16 Apr 2019 16:00:10 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*DJoo_O-eNQAnYrc4blWzAg.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Sterling Osborne, PhD Researcher</p>
<p>Following <a target="_blank" href="https://medium.freecodecamp.org/how-to-apply-reinforcement-learning-to-real-life-planning-problems-90f8fa3dc0c5">my recent article on applying Reinforcement Learning to real life problems</a>, I decided to demonstrate this with a small example. The aim is to create an algorithm that can find a suitable choice of food products to fit within a budget and meet my personal preferences.</p>
<p>I have also posted the description, data and code kernel to Kaggle and this can be found <a target="_blank" href="https://www.kaggle.com/osbornep/reinforcement-learning-for-meal-planning-in-python/notebook">here</a>.</p>
<p>Please let me know if you have any questions or suggestions.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/LZJGp50r3YXHLlCWaMEoqYBhRwvpxdjsdgwa" alt="Image" width="800" height="533" loading="lazy">
<em>Photo: Pixabay</em></p>
<h3 id="heading-aim">Aim</h3>
<p>When food shopping, there are many different products for the same ingredient to choose from in supermarkets. Some are less expensive, others are of higher quality. I would like to create a model that, for the required ingredients, can select the optimal products required to make a meal that is both:</p>
<ol>
<li>Within my budget</li>
<li>Meets my personal preferences</li>
</ol>
<p>To do this, I will first build a very simple model that can recommend the products that are within my budget before introducing my preferences.</p>
<p>The reason we use a model is so that we could, in theory, scale the problem to consider more and more ingredients and products that would cause the problem to then be beyond the possibility of any mental calculations.</p>
<h3 id="heading-method">Method</h3>
<p>To achieve this, I will be building a simple reinforcement learning model and I’ll use Monte Carlo learning to find the optimal combination of products.</p>
<p>First, let us formally define the parts of our model as a Markov Decision Process:</p>
<ul>
<li>We have a finite number of ingredients required to make any meal, and these are considered to be our <strong>States</strong></li>
<li>There is a finite number of possible products for each ingredient, and these are therefore the <strong>Actions of each state</strong></li>
<li>Our preferences become the <strong>Individual Rewards</strong> for selecting each product; we will cover this in more detail later</li>
</ul>
<p>Monte Carlo learning takes the combined quality of each step towards reaching an end goal and requires that, in order to assess the quality of any step, we must wait and see the outcome of the whole combination. This process is repeated over and over again in episodes with many different products until it finds the selection that appears to lead to a positive outcome repeatedly. This is the reinforcement learning process, where our environment is simulated based on the knowledge about costs and preferences we obtained.</p>
<p>Monte Carlo is often avoided due to the time required to go through the whole process before being able to learn. However, in our problem it is required, because the final check on whether a combination of products is good or bad is to add up the real cost of those selected and see whether this is below or above our budget. Furthermore, at least at this stage, we will not be considering more than a few ingredients, so the time taken is not significant in this regard.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/5XabGzV2o9PFKK7nUoP-QHSShPx6a2ur1XdM" alt="Image" width="710" height="273" loading="lazy">
<em><a target="_blank" href="https://www.tractica.com/artificial-intelligence/reinforcement-learning-and-its-implications-for-enterprise-artificial-intelligence/">https://www.tractica.com/artificial-intelligence/reinforcement-learning-and-its-implications-for-enterprise-artificial-intelligence/</a></em></p>
<h3 id="heading-sample-data">Sample Data</h3>
<p>For this demonstration, I have created some sample data for a meal where we have 4 ingredients and 9 products, as shown in the diagram below.</p>
<p>We need to select one product for each ingredient in the meal.</p>
<p>This means we have 2 x 2 x 2 x 3 = 24 possible selections of products for the 4 ingredients.</p>
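<p>As a quick check on that count, here is a minimal sketch that enumerates every selection. The labels a1 to d3 follow the naming used later in this article; the dictionary name <code>products_by_ingredient</code> is my own and is only for illustration.</p>

```python
from itertools import product

# Product labels per ingredient, following the article's a/b/c/d naming.
products_by_ingredient = {
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "c": ["c1", "c2"],
    "d": ["d1", "d2", "d3"],
}

# One product per ingredient: the Cartesian product of the four lists.
selections = list(product(*products_by_ingredient.values()))

print(len(selections))  # 24
print(selections[0])    # ('a1', 'b1', 'c1', 'd1')
```

<p>With only 24 selections we could simply check every one by hand, which is why the scaling argument made earlier matters: each extra ingredient multiplies the number of combinations.</p>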
<p>I have also included the real cost for each product and V_0.</p>
<p>V_0 is simply the initial quality of each product to meet our requirements and we set this to 0 for each.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/eCjXwnetr8IA787ykWpZq4k6OdbHEroKoA5M" alt="Image" width="800" height="446" loading="lazy">
<em>Diagram showing the possible product choices for each ingredient</em></p>
<p>First, we import the required packages and data.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/IJ5NNxRwWJQ8QcwXNicYIDoxXZjcEGzOdzRz" alt="Image" width="800" height="396" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/YvGNf8QiGazYIDSMQhdmNWKU8TnfviITZHrd" alt="Image" width="394" height="310" loading="lazy"></p>
<h3 id="heading-applying-the-model-in-theory">Applying the Model in Theory</h3>
<p>For now, I will not introduce any individual rewards for the products. Instead, I will simply focus on whether the combination of products selected is below our budget or not. This outcome is defined as the <strong>Terminal Reward</strong> of our problem.</p>
<p>For example, say we have a budget of £30 and consider the choice:</p>
<p>a1→b1→c1→d1</p>
<p>The real cost of this selection is:</p>
<p>£10+£8+£3+£8 = £29 &lt; £30</p>
<p>And therefore, our terminal reward is:</p>
<p>R_T=+1</p>
<p>Whereas, for the choice:</p>
<p>a2→b2→c2→d1</p>
<p>The real cost of this selection is:</p>
<p>£6+£11+£7+£8 = £32 &gt; £30</p>
<p>And therefore, our terminal reward is:</p>
<p>R_T=−1</p>
<p>For now, we are simply telling our model whether the choice is good or bad and will observe what this does to the results.</p>
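<p>The two worked examples above can be reproduced with a few lines of Python. This is only a sketch: the costs are the ones quoted in the examples, the function name <code>terminal_reward</code> is hypothetical, and I am assuming that a total exactly equal to the budget counts as within budget.</p>

```python
# Costs in pounds for the products named in the two worked examples.
real_cost = {"a1": 10, "a2": 6, "b1": 8, "b2": 11, "c1": 3, "c2": 7, "d1": 8}


def terminal_reward(selection, budget):
    """Return +1 if the selection's total cost fits the budget, else -1.

    Assumption: a total exactly equal to the budget counts as within budget.
    """
    total = sum(real_cost[p] for p in selection)
    return 1 if budget >= total else -1


print(terminal_reward(["a1", "b1", "c1", "d1"], budget=30))  # 1  (total is 29)
print(terminal_reward(["a2", "b2", "c2", "d1"], budget=30))  # -1 (total is 32)
```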
<h3 id="heading-model-learning">Model Learning</h3>
<p>So how does our model actually learn? In short, we get our model to try out lots of combinations of products and at the end of each tell it whether its choice was good or bad. Over time, it will recognise that some products generally lead to getting a good outcome while others do not.</p>
<p>What we end up creating are values for how good each product is, denoted V(a). We have already introduced the initial V(a) for each product, but how do we go from these initial values to actually being able to make a decision?</p>
<p>For this, we need an <strong>Update Rule</strong>. This tells the model, after each time it has presented its choice of products and we have told it whether its selection is good or bad, how to add this to our initial values.</p>
<p>Our update rule is as follows:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/tBfSiFmawCqLJM2rP0VpSrWRa7btGNT4iOiw" alt="Image" width="247" height="41" loading="lazy"></p>
<p>This may look unusual at first but in words we are simply updating the value of any action, V(a), by an amount that is either a little more if the outcome was good or a little less if the outcome was bad.</p>
<p>G is the <strong>Return</strong> and is simply the total reward obtained. Currently in our example, this is simply the terminal reward (+1 or -1 accordingly). We will reintroduce this later when we include individual product rewards.</p>
<p>Alpha, α, is the <strong>Learning Rate</strong> and we will demonstrate how this affects the results in more detail later, but for now the simple explanation is: “The learning rate determines to what extent newly acquired information overrides old information. A factor of 0 makes the agent learn nothing, while a factor of 1 makes the agent consider only the most recent information.” (<a target="_blank" href="https://en.wikipedia.org/wiki/Q-learning">https://en.wikipedia.org/wiki/Q-learning</a>)</p>
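<p>To make the rule concrete, here is a minimal sketch in Python. The rule itself is shown as an image above, so I am assuming the standard constant-step-size form V(a) ← V(a) + α(G − V(a)), which matches the description in words and the quoted behaviour of the learning rate. The function name <code>update_value</code> is hypothetical.</p>

```python
def update_value(value, ret, alpha):
    """Move the current value a fraction alpha of the way towards the return G."""
    return value + alpha * (ret - value)


print(update_value(0.0, ret=1, alpha=0.5))   # 0.5   good outcome, value rises
print(update_value(0.5, ret=-1, alpha=0.5))  # -0.25 bad outcome, value falls
print(update_value(0.5, ret=-1, alpha=0.0))  # 0.5   alpha of 0 learns nothing
print(update_value(0.5, ret=-1, alpha=1.0))  # -1.0  alpha of 1 keeps only the newest return
```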
<h3 id="heading-small-demo-of-updating-values">Small Demo of Updating Values</h3>
<p>So how do we actually use this with our model?</p>
<p>Let us start with a table that has each product and its initial V_0(a):</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/dKkesqok94JNwSZHLDBzEdKnmyKP5NmqwERQ" alt="Image" width="140" height="311" loading="lazy"></p>
<p>We now pick a random selection of products; each combination is known as an <strong>episode</strong>. We also set α=0.5 for now, just for simplicity in the calculations.</p>
<p>For example:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/rAPEHqEY3U0XJekgOwusn6dYbxpQgHQ4ghhl" alt="Image" width="314" height="305" loading="lazy"></p>
<p>Therefore, all actions that lead to this positive outcome are updated as well to produce the following table with V1(a):</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/GSmEiJ8IZIYsbsUaieXz4jBQfjeL3wzch1HC" alt="Image" width="191" height="313" loading="lazy"></p>
<p>So let us try again by picking another random episode:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/r7aeYiWCHv9zR0d9e5E2QbbKaN17BjllHdz-" alt="Image" width="397" height="534" loading="lazy"></p>
<p>Therefore, we can add V2(a) to our table:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/E3Y525fxV7LX8vcdTvqVNyaGT0GAQ8vkLb4I" alt="Image" width="263" height="320" loading="lazy"></p>
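<p>The same bookkeeping can be sketched in code. The two episodes below are the selections from the budget example earlier (one within budget, one over), not necessarily the ones shown in the tables above, and the update assumes the form V(a) ← V(a) + α(G − V(a)). The helper name <code>apply_episode</code> is hypothetical.</p>

```python
# All nine products start at V_0(a) = 0.
values = {p: 0.0 for p in ["a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2", "d3"]}


def apply_episode(values, selection, ret, alpha=0.5):
    """Update every product chosen in the episode towards the return G."""
    for p in selection:
        values[p] += alpha * (ret - values[p])


apply_episode(values, ["a1", "b1", "c1", "d1"], ret=1)   # within budget
apply_episode(values, ["a2", "b2", "c2", "d1"], ret=-1)  # over budget

print(values["a1"], values["a2"], values["d1"], values["d2"])
# 0.5 -0.5 -0.25 0.0
```

<p>Notice that d1 appears in both episodes, so its value is pulled up by the first and down by the second, while d2 has never been selected and keeps its initial value.</p>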
<h3 id="heading-action-selection">Action Selection</h3>
<p>You may have noticed in the demo, I have simply randomly selected the products in each episode. We could do this, but using a completely random selection process may mean that some actions are not selected often enough to know whether they are good or bad.</p>
<p>Similarly, if we went the other way and selected products greedily, i.e. the ones that currently have the best value, we might miss one that is in fact better but was never given a chance. For example, if we chose the best actions from V2(a) we would get a2, b1, c1 and d2 or d3, which both provide a positive terminal reward. Therefore, if we used a purely greedy selection process, we would never consider any other products, as these continue to provide a positive outcome.</p>
<p>Instead, we implement <strong>epsilon-greedy</strong> action selection where we randomly select products with probability ϵ, and greedily select products with probability 1−ϵ where:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/w7prUkGhYtx6PfE16dfGTFC2QTZG0DhbEWRy" alt="Image" width="92" height="36" loading="lazy"></p>
<p>This means that we are going to reach the optimal choice of products quickly, as we continue to test whether the ‘good’ products are in fact optimal. But it also leaves room for us to explore other products occasionally, just to make sure they aren’t as good as our current choice.</p>
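<p>A minimal sketch of epsilon-greedy selection for a single ingredient might look like the following. The function name, the random tie-breaking and the example values are assumptions for illustration, not taken from the original model.</p>

```python
import random


def epsilon_greedy(values, epsilon, rng=random):
    """Pick a product: at random with probability epsilon, otherwise the best valued."""
    if rng.random() < epsilon:
        return rng.choice(list(values))          # explore
    best = max(values.values())
    # Exploit; break ties at random so equally valued products all get a chance.
    return rng.choice([p for p, v in values.items() if v == best])


# Hypothetical values for the three 'd' products.
values_d = {"d1": -0.25, "d2": 0.5, "d3": 0.5}
rng = random.Random(0)
picks = [epsilon_greedy(values_d, 0.2, rng) for _ in range(1000)]
print({p: picks.count(p) for p in values_d})
```

<p>With ϵ = 0 the function never returns d1 here, which is exactly the risk of a purely greedy process described above.</p>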
<h3 id="heading-building-and-applying-our-model">Building and Applying our Model</h3>
<p>We are now ready to build a simple model as shown in the MCModelv1 function below.</p>
<p>Although this seems complex, I have done nothing more than apply the methods previously discussed in such a way that we can vary the inputs and still obtain results. Admittedly, this was my first attempt at doing this and so my coding may not be perfectly written but should be sufficient for our requirements.</p>
<p>To calculate the terminal reward, we currently use the following condition to check if the total cost is less or more than our budget:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/3Wqi4imz2FQDmGOWAj96pNV3lhw6PcEv-BwO" alt="Image" width="800" height="121" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/Sh2bBPQoanKimfAFPNLATlzHIRVZeMsapqVI" alt="Image" width="653" height="339" loading="lazy"></p>
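<p>In plain Python, a budget check of this kind could be sketched as below. Treating a total exactly equal to the budget as a success is an assumption, and the function name is hypothetical.</p>

```python
def terminal_reward(total_cost, budget):
    """Return +1 if the basket is within budget and -1 if it is over budget."""
    # Assumption: a basket that costs exactly the budget counts as a success.
    return 1 if total_cost <= budget else -1


print(terminal_reward(21, 23))  # 1
print(terminal_reward(25, 23))  # -1
```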
<p><strong>The full code for the model is too large to fit here nicely, but can be found at the linked <a target="_blank" href="https://www.kaggle.com/osbornep/reinforcement-learning-for-meal-planning-in-python/notebook">Kaggle</a> page.</strong></p>
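<p>For readers who want something runnable on the page, here is a heavily simplified sketch of the same idea. It is not the MCModelv1 function from the notebook: the function name, the product prices and the budget of 30 are all assumptions, and it only returns the final value table.</p>

```python
import random


def run_mc_sketch(costs, budget, episodes, alpha, epsilon, seed=0):
    """Hypothetical minimal Monte Carlo loop: one product per ingredient per episode."""
    rng = random.Random(seed)
    values = {p: 0.0 for p in costs}                 # V(a) starts at 0
    ingredients = sorted({p[0] for p in costs})      # 'a1' belongs to ingredient 'a'
    for _ in range(episodes):
        basket = []
        for ing in ingredients:
            options = [p for p in costs if p[0] == ing]
            if rng.random() < epsilon:
                basket.append(rng.choice(options))   # explore
            else:
                basket.append(max(options, key=lambda p: values[p]))  # exploit
        total = sum(costs[p] for p in basket)
        g = 1.0 if total <= budget else -1.0         # terminal reward
        for p in basket:
            values[p] += alpha * (g - values[p])     # update every product used
    return values


# Illustrative prices only.
costs = {"a1": 10, "a2": 6, "b1": 8, "b2": 11,
         "c1": 3, "c2": 7, "d1": 8, "d2": 5, "d3": 1}
values = run_mc_sketch(costs, budget=30, episodes=100, alpha=0.5, epsilon=0.5)
for product, v in sorted(values.items()):
    print(product, round(v, 3))
```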
<h4 id="heading-we-now-run-our-model-with-some-sample-variables">We now run our model with some sample variables:</h4>
<p><img src="https://cdn-media-1.freecodecamp.org/images/wsawgTD72Mfaf2GqTzg9-2EPXgWHV7Jcx2Cq" alt="Image" width="680" height="356" loading="lazy"></p>
<p>In our function, we have 6 outputs from the model:</p>
<ul>
<li>Mdl[0]: Returns the Sum of all V(a) for each episode</li>
<li>Mdl[1]: Returns the Sum of V(a) for the cheapest products, possible to define due to the simplicity of our sample data</li>
<li>Mdl[2]: Returns the Sum of V(a) for the non-cheapest products</li>
<li>Mdl[3]: Returns the optimal actions of the final episode</li>
<li>Mdl[4]: Returns the data table with the final V(a) added for each product</li>
<li>Mdl[5]: Shows the optimal action at each episode</li>
</ul>
<p>There is a lot to take away from these, so let us go through each and establish what we can learn to improve our model.</p>
<h4 id="heading-optimal-actions-of-final-episode">Optimal actions of final episode</h4>
<p>First, let’s see what the model suggests we should select. In this run it suggests actions, or products, that have a total cost below budget which is good.</p>
<p>However, there is still more that we can check to help us understand what is going on.</p>
<p>First, we can plot the total V for all actions, and we see that this is converging, which is ideal. We want our model to converge so that as we try more episodes we are ‘zoning-in’ on the optimal choice of products. The output converges because we are reducing the amount it learns each time by a factor of α, in this case 0.5. We will show later what happens if we vary this or don’t apply this at all.</p>
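<p>To see why scaling each update by α leads to convergence, consider one product that receives the same return of +1 in five consecutive episodes. This is a simplified illustration with an assumed starting value of 0, not output from the model.</p>

```python
alpha = 0.5
value = 0.0          # assumed starting value
history = []
for _ in range(5):
    value += alpha * (1.0 - value)   # each step closes half of the remaining gap
    history.append(value)

print(history)  # [0.5, 0.75, 0.875, 0.9375, 0.96875]
```

<p>Each step is half the size of the one before, so the value settles towards the return instead of jumping straight to it.</p>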
<p>We have also plotted the sum of V for the products we know are cheapest, based on being able to assess the small sample size, and the others separately. Again, both are converging positively although the cheaper products appear to have slightly higher values.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/4WTvyb0pa3be9kJmsvP8PQsIGSce2EdbQPLj" alt="Image" width="680" height="492" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/UFFckpQP0O2XLeE49eoTeaiVWmKY-fScgvJ0" alt="Image" width="677" height="763" loading="lazy"></p>
<h4 id="heading-so-why-is-this-happening-and-why-did-the-model-suggest-the-actions-it-did">So why is this happening and why did the model suggest the actions it did?</h4>
<p>To understand that, we need to dissect the suggestions made by the model at each episode and how this relates to our return.</p>
<p>Below, we have taken the optimal action for each state. We can see that the suggested actions do vary greatly between episodes and the model appears to decide which it wants to suggest very quickly.</p>
<p>Therefore, I have plotted the total cost of the suggested actions at each episode and we can see the actions vary initially then smooth out and the resulting total cost is below our budget. This helps us understand what is going on greatly.</p>
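<p>As a sketch of how the suggested actions and their total cost can be read off a value table, the snippet below picks the highest valued product for each ingredient and sums the prices. The prices, values and the helper name <code>greedy_basket</code> are made up for illustration.</p>

```python
def greedy_basket(values, costs):
    """Pick the highest valued product per ingredient and return (basket, total cost)."""
    ingredients = sorted({p[0] for p in values})   # 'a1' belongs to ingredient 'a'
    basket = [max((p for p in values if p[0] == ing), key=values.get)
              for ing in ingredients]
    return basket, sum(costs[p] for p in basket)


# Illustrative prices and values only.
costs = {"a1": 10, "a2": 6, "b1": 8, "b2": 11,
         "c1": 3, "c2": 7, "d1": 8, "d2": 5, "d3": 1}
values = {"a1": -0.2, "a2": 0.6, "b1": 0.5, "b2": 0.1,
          "c1": 0.7, "c2": 0.3, "d1": -0.1, "d2": 0.2, "d3": 0.4}

basket, total = greedy_basket(values, costs)
print(basket, total)  # ['a2', 'b1', 'c1', 'd3'] 18
```

<p>Repeating this after every episode gives the series of total costs that is plotted against the budget.</p>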
<p>So far, all we have told the model is to provide a selection that is below budget, and it has. It has simply found an answer that is below the budget as required.</p>
<p>So what is the next step? Before I introduce rewards I want to demonstrate what happens if I vary some of the parameters and what we can do if we decide to change what we want our model to suggest.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/A5ywpBkRB2P3G-8WCwYu1-HQ2lAuON565bS3" alt="Image" width="681" height="325" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/LrC00RigDKHw90YQMaAnKVxogZJ9urKzrtP8" alt="Image" width="297" height="851" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/pMFzDzpPZRDDzpSpUuEb7mUbhLCKHD93kldf" alt="Image" width="637" height="763" loading="lazy"></p>
<h3 id="heading-effect-of-changing-parameters-and-how-to-change-models-aim">Effect of Changing Parameters and How to Change Model’s Aim</h3>
<p>We have a few parameters that can be changed:</p>
<ol>
<li>The Budget</li>
<li>Our learning rate, α</li>
<li>Our action selection parameter, ϵ</li>
</ol>
<h4 id="heading-varying-budget">Varying Budget</h4>
<p>First, let us observe what happens if we make our budget either impossibly low or high.</p>
<p>A budget that is too small means we only ever obtain a negative reward, which forces our V to converge negatively, whereas a budget that is too high will cause our V to converge positively, as all actions are continually positive.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/ClL2XcjSAPq4hzJWOLLygAyYiI4QS7-hmmYi" alt="Image" width="636" height="712" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/ePz3LMNDjQN1FQGHdcYHEF65gA9A1MOu7Ywh" alt="Image" width="638" height="721" loading="lazy"></p>
<p>The latter seems like what we had in our first run: a lot of the episodes lead to positive outcomes, so many combinations of products are possible and there is little distinction between the cheapest products and the rest.</p>
<p>If instead we consider a budget that is reasonably low given the prices of the products, we can see a trend where the cheapest products look to be converging positively and the more expensive products converging negatively. However, the smoothness of these is far from ideal; both appear to be oscillating greatly between each episode.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/9vYicIrKqHGM8unbhIyICWNeKLD7FRe0zKbm" alt="Image" width="639" height="610" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/sboascUNlUZ1xKs64W5x4g7Jd0tVOK4NYG3l" alt="Image" width="362" height="460" loading="lazy"></p>
<p>So what can we do to reduce the ‘spikiness’ of the outputs? This leads us onto our next parameter, alpha.</p>
<h3 id="heading-varying-alpha">Varying Alpha</h3>
<h4 id="heading-a-good-explanation-of-what-is-going-on-with-our-output-due-to-alpha-is-described-by-stack-overflow-user-vishalthebeast">A good explanation of what is going on with our output due to alpha is given by Stack Overflow user VishalTheBeast:</h4>
<blockquote>
<p>“Learning rate tells the magnitude of step that is taken towards the solution.</p>
<p>It should not be too big a number as it may continuously oscillate around the minima and it should not be too small of a number else it will take a lot of time and iterations to reach the minima.</p>
<p>The reason why decay is advised in learning rate is because initially when we are at a totally random point in solution space we need to take big leaps towards the solution and later when we come close to it, we make small jumps and hence small improvements to finally reach the minima.</p>
<p>Analogy can be made as: in the game of golf when the ball is far away from the hole, the player hits it very hard to get as close as possible to the hole. Later when he reaches the flagged area, he choses a different stick to get accurate short shot.</p>
<p>So it’s not that he won’t be able to put the ball in the hole without choosing the short shot stick, he may send the ball ahead of the target two or three times. But it would be best if he plays optimally and uses the right amount of power to reach the hole. Same is for decayed learning rate.” — <a target="_blank" href="https://stackoverflow.com/questions/33011825/learning-rate-of-a-q-learning-agent">source</a></p>
</blockquote>
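<p>The decay mentioned in the quote can be sketched with a simple schedule. Note that the model in this article uses a fixed alpha; the schedule below, its name and its parameters are assumptions used only to illustrate the idea of taking big steps early and small steps later.</p>

```python
def decayed_alpha(alpha0, decay_steps, episode):
    """Hypothetical schedule: start at alpha0 and shrink as the episode count grows."""
    return alpha0 / (1.0 + episode / decay_steps)


for episode in (0, 10, 90):
    print(episode, decayed_alpha(0.5, 10, episode))   # 0.5, then 0.25, then 0.05
```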
<p>To better demonstrate the effect of varying our alpha, I will be using an animated plot created using Plot.ly.</p>
<p>I have written a more detailed guide on how to do this <a target="_blank" href="https://towardsdatascience.com/creating-interactive-animation-for-parameter-optimisation-using-plot-ly-8136b2997db">here</a>.</p>
<p>In our first animation, we vary alpha between 1 and 0.1. This enables us to see that as we reduce alpha our output smooths somewhat, but it is still pretty rough.</p>
<p>However, even though the results are smoothing out, they are no longer converging within 100 episodes and, furthermore, the output seems to alternate between each alpha. This is due to a combination of small alphas requiring more episodes to learn and our action selection parameter epsilon being 0.5. Essentially, the output is still being decided by randomness half of the time, and so our results are not converging within the 100-episode frame.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/aB38O-aTjeYWBdtMd9NRPAmkFpfCKfzR0qPB" alt="Image" width="638" height="644" loading="lazy"></p>
<p>Running this through our animated plots produces something similar to the following:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/wVLIs9ttJH3P27B50Xf9rCYg0x4YxU6otsy5" alt="Image" width="600" height="288" loading="lazy"></p>
<h3 id="heading-varying-epsilon">Varying Epsilon</h3>
<p>With the previous results in mind, we now fix alpha to be 0.05 and vary epsilon between 1 and 0 to show the effect of moving from selecting actions completely at random to selecting actions greedily.</p>
<p>The graphs below show three snapshots from varying epsilon, but the animated version can be viewed in the <a target="_blank" href="https://www.kaggle.com/osbornep/reinforcement-learning-for-meal-planning-in-python/notebook">Kaggle</a> kernel.</p>
<p>We see that having a high epsilon creates very sporadic results. Therefore we should select something reasonably small like 0.2. Although having epsilon equal to 0 looks good because of how smooth the curve is, as we mentioned earlier, this may lead us to a choice very quickly, but that choice may not be the best. We want some randomness so the model can explore other actions if needed.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/MSioaunlsQvp2AADkQjHB0R6X6yAfTEAIn0Q" alt="Image" width="700" height="450" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/Rnmcx-e31oLcKZA9-M4fyHHs5MOdTPoQXmdW" alt="Image" width="700" height="450" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/R8FGYGkQjh55aapbnQs3TS1dBsC7iU7uKLaX" alt="Image" width="700" height="450" loading="lazy"></p>
<h3 id="heading-increasing-the-number-of-episodes">Increasing the Number of Episodes</h3>
<p>Lastly, we can increase the number of episodes. I refrained from doing this sooner because we were running 10 models in a loop to output our animated graphs and this would have caused the time taken to run the model to explode.</p>
<p>We noted that a low alpha would require more episodes to learn so we can run our model for 1000 episodes.</p>
<p>However, we still notice that the output is oscillating, but, as mentioned before, this is due to our aim being simply to recommend a combination that is below budget. What this shows is that the model can’t find the single best combination when there are many that fit below our budget.</p>
<p>Therefore, what happens if we change our aim slightly so that we can use the model to find the cheapest combination of products?</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/2h7mS1jrLjv3KD47T77ARG-2tTscftXWGJUI" alt="Image" width="638" height="594" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/9xMMzsxigFz4Zx3n-womG45Q4qzWsYtvEBx4" alt="Image" width="364" height="445" loading="lazy"></p>
<h3 id="heading-changing-our-models-aim-to-find-the-cheapest-combination-of-products">Changing our Model’s Aim to Find the Cheapest Combination of Products</h3>
<p>The aim of this is to more clearly separate the cheapest products from the rest, and it nearly always provides us with the cheapest combination of products.</p>
<p>To do this, all we need to do is adapt our model slightly to provide a terminal reward that is relative to how far below or above budget the combination in the episode is.</p>
<p>This can be done by changing the calculation for return to:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/xW4GsM4rWI0XRPjxKYPn7dFmg5nz8DtuLeBM" alt="Image" width="638" height="114" loading="lazy"></p>
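<p>One way such a relative return could be written is sketched below. The exact formula used in the notebook is the one shown in the image and may differ; scaling the gap by the budget and the function name are assumptions.</p>

```python
def relative_return(total_cost, budget):
    """Hypothetical relative return: positive below budget, negative above it."""
    # Assumption: the gap to the budget is scaled by the budget itself.
    return (budget - total_cost) / budget


print(relative_return(18, 23) > relative_return(22, 23))  # True: cheaper scores higher
print(relative_return(25, 23) < 0)                        # True: over budget is penalised
```

<p>Because a cheaper basket now earns a larger return than one that only just fits, the values of the cheapest products are pushed apart from the rest.</p>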
<p>We now see that the separation between the cheapest products and the others is emphasised.</p>
<p>This really demonstrates the flexibility of reinforcement learning and how easy it can be to adapt the model based on your aims.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/u8R2pcQCWZhl2tCIZ3nu2cFF90oMWhISSXJy" alt="Image" width="637" height="761" loading="lazy"></p>
<h3 id="heading-introducing-preferences">Introducing Preferences</h3>
<p>So far, we have not included any personal preferences towards products. If we wanted to include this, we can simply introduce rewards for each product whilst still having a terminal reward that encourages the model to be below budget.</p>
<p>This can be done by changing the calculation for return to:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/8S-ifGXjN3WYCyYpGgC2LYU2EJukmVB5XUo2" alt="Image" width="627" height="110" loading="lazy"></p>
<p>So why is our return calculation now like this?</p>
<p>Well, firstly, we still want our combination to be below budget, so we provide the positive and negative rewards for being below and above budget respectively.</p>
<p>Next, we want to account for the reward of each product. For our purposes, we define the rewards to be a value between 0 and 1. MC return is formally calculated using the following:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1vJAlBcCYSLMbG41EbcNi5TkW191DHCXR2hz" alt="Image" width="141" height="59" loading="lazy"></p>
<p>γ is the discount factor and this tells us how much we value later steps compared to earlier steps. In our case, all actions are equally as important to reaching the desired outcome of being below budget so we set γ=1.</p>
<p>However, to ensure that we reach the primary goal of being below budget, we take the average of the rewards for each action, so that this term never exceeds 1 in size, the magnitude of the terminal reward (1 or -1 respectively).</p>
<p>Again, the full model can be found in the <a target="_blank" href="https://www.kaggle.com/osbornep/reinforcement-learning-for-meal-planning-in-python/notebook">Kaggle</a> kernel but is too large to include here.</p>
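<p>The snippet below sketches one way to combine these pieces, assuming the averaged product rewards are simply added to the ±1 terminal reward. That combination, the function names, the budget of 30 and the reward values are all assumptions; the notebook holds the actual calculation.</p>

```python
def discounted_sum(rewards, gamma=1.0):
    """Sum of the rewards with the k-th one discounted by gamma ** k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def preference_return(total_cost, budget, rewards, gamma=1.0):
    """Hypothetical return: +1 or -1 for the budget plus the average product reward."""
    terminal = 1.0 if total_cost <= budget else -1.0
    return terminal + discounted_sum(rewards, gamma) / len(rewards)


rewards = [0.5, 0.5, 0.5, 0.5]                 # one reward in [0, 1] per product
print(preference_return(25, 30, rewards))      # 1.5
print(preference_return(35, 30, rewards))      # -0.5
```

<p>Under these assumptions, an over-budget basket can score at most 0 while a basket within budget scores at least 1, so preferences can only reorder baskets that already meet the budget.</p>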
<h3 id="heading-introducing-preferences-using-rewards">Introducing Preferences using Rewards</h3>
<p>Say we decided we wanted products a1 and b2; we could add a reward to each. Let us see what happens if we do this in the output and graphs below. We have changed our budget slightly, as a1 and b2 add up to £21, which means there is no way to select two more products that would keep the total below a budget of £23.</p>
<p>Applying a very high reward forces the model to pick a1 and b2 then work around to find products that will put it under our budget.</p>
<p>I have kept in the comparison between the cheapest products and the rest to show that the model no longer values the cheapest products the most. Instead we get the output a1, b2, c1 and d3, which has a total cost of £25. This is both below our budget and includes our preferred products.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/4CBCAhqhP1HzYgZgYyK4ugbTUHHdCHl0uEg0" alt="Image" width="636" height="768" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/Szh6HaR-2PnC7tACxh9OI4B3z5YMUpEo-vyP" alt="Image" width="350" height="471" loading="lazy"></p>
<p>Let’s try one more reward signal. This time, I give some reward to each product but want the model to provide the best combination from my rewards that still keeps us below budget.</p>
<p>We have the following rewards:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/H-EJrGUSdqRcbj5QpAQ29VxP8YbGiGF8Jvyy" alt="Image" width="110" height="234" loading="lazy"></p>
<p>Running this model a few times shows that it:</p>
<ul>
<li>Often selects a1, as this has a much higher reward</li>
<li>Always picks c1, as the rewards are the same but it is cheaper</li>
<li>Has a hard time choosing between b1 and b2, as the rewards are 0.5 and 0.6 but the costs are £8 and £11 respectively</li>
<li>Typically selects d3, as it is significantly cheaper than d1 even though its reward is slightly less</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/f7kOKIZRdMFsAY5DgTUEK9xb5JOkpMfKjn5n" alt="Image" width="639" height="762" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1e31yEJBL6uufYC8-ByEst2QyrZMCx-gFggI" alt="Image" width="374" height="464" loading="lazy"></p>
<h3 id="heading-conclusion">Conclusion</h3>
<p>We have managed to build a Monte Carlo Reinforcement Learning model to:</p>
<ol>
<li>recommend products below a budget,</li>
<li>recommend the cheapest products, and</li>
<li>recommend the best products based on a preference that is still below a budget.</li>
</ol>
<p>Along the way, we have demonstrated the effect of changing parameters in reinforcement learning and how understanding these enables us to reach a desired result.</p>
<p>There is much more that we could do. In my mind, the end goal would be to apply this to a real recipe and products from a supermarket, where the increased number of ingredients and products needs to be accounted for.</p>
<p>I created this sample data and problem to better my understanding of Reinforcement Learning and hope that you find it useful.</p>
<p>Thanks for reading!</p>
<p>Sterling Osborne</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
