Do you have a favorite fictional or non-fictional character? Let's build an AI model that speaks just like your favorite character. Then we'll deploy that model as a chatbot on Discord, a popular messaging platform.

What's great about this tutorial is that all the coding and deployment happen in the cloud (for free!). So you don't even need a local IDE to follow along. You can follow along using the code in my GitHub repository:

RuolinZheng08/twewy-discord-chatbot
Discord AI Chatbot using DialoGPT, trained on the game transcript of The World Ends With You - RuolinZheng08/twewy-discord-chatbot

You can also check out this tutorial on YouTube:

Outline of this Tutorial

There are four steps that we need to tackle:

  1. Gather text data from your favorite character to train the model
  2. Train a DialoGPT model in Google Colab using its free GPU
  3. Host the model on Hugging Face's Model Hub, which provides an API for us to query the model
  4. Build a Discord bot in Repl.it

How to Prepare the Data

For our chatbot to learn to converse, we need text data in the form of dialogues. This is essentially how our chatbot learns to respond to different exchanges and contexts.

There are a lot of interesting datasets on Kaggle for popular cartoons, TV shows, and other media.

I chose a character named Joshua from my favorite adventure video game, The World Ends With You (TWEWY). The game has a ton of awesomely rich and quirky dialogue, so I am confident that my chatbot will develop a unique personality :)

TWEWY Gameplay Screenshot. Joshua (Left) is quite a snarky character.

With some simple parsing and text cleaning, I turned the game's transcript into a dataset similar to the other Kaggle datasets.
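To illustrate what that parsing can look like (the transcript snippet and column names below are hypothetical, not the game's actual script), it mostly boils down to splitting each line on the speaker delimiter:

```python
import pandas as pd

# hypothetical transcript snippet in "Speaker: line" format
raw = """Neku: Who are you?
Joshua: Call me Joshua, dear.
Neku: What do you want from me?
Joshua: Just a little partnership."""

rows = []
for line in raw.splitlines():
    # split each line into speaker name and dialogue text
    name, _, text = line.partition(': ')
    rows.append({'name': name, 'line': text})

df = pd.DataFrame(rows)
print(df.head())
```

Real transcripts will need extra cleaning (stage directions, multi-line speech, and so on), but the end result is the same: one dialogue line per row.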

TWEWY Game Script, a Kaggle dataset created by Lynn Zheng

As a rule of thumb, 1000+ lines of dialogue for your character should be enough. Now let's proceed to the exciting part – training the model.

How to Train the Model

This step is largely inspired by a Medium post on training a conversational model using transcripts from Rick and Morty. We will use its Google Colab notebook for training as well.

Under the hood, the model is a Generative Pre-trained Transformer (GPT), one of the most popular language model architectures these days.

Instead of training from scratch, we will load Microsoft's pre-trained GPT, DialoGPT-small, and fine-tune it on our dataset. For more theory on how GPT works, refer back to the Medium post.
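For reference, loading the pre-trained checkpoint with the transformers library looks roughly like this (the Colab notebook already handles this step for you):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# start from Microsoft's pre-trained conversational checkpoint
# instead of training a language model from scratch
tokenizer = AutoTokenizer.from_pretrained('microsoft/DialoGPT-small')
model = AutoModelForCausalLM.from_pretrained('microsoft/DialoGPT-small')
```

Fine-tuning then continues training these weights on our character's dialogue.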

In the notebook, all we need to change is the dataset we use. To prepare the data for training, the code converts our data into a contexted format shown below.

df = pd.read_csv(...) # your data here

# the following lines are already in the notebook
contexted = []
n = 7 # the context window size

for i in range(n, len(df['line'])):
  row = []
  prev = i - 1 - n # we additionally subtract 1, so each row will contain the current response and the 7 previous responses
  for j in range(i, prev, -1):
    row.append(df['line'][j])
  contexted.append(row)
Contexted Data
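If I recall the notebook correctly, these rows then become a DataFrame whose first column is the response and whose remaining columns hold the preceding context. A toy version (column names assumed) looks like this:

```python
import pandas as pd

# toy stand-in for the rows built above: each row holds the current
# response followed by the n = 7 previous lines, newest to oldest
contexted = [
    ['line 8', 'line 7', 'line 6', 'line 5',
     'line 4', 'line 3', 'line 2', 'line 1'],
    ['line 9', 'line 8', 'line 7', 'line 6',
     'line 5', 'line 4', 'line 3', 'line 2'],
]
columns = ['response', 'context'] + ['context/' + str(i) for i in range(6)]
df = pd.DataFrame.from_records(contexted, columns=columns)
print(df.shape)  # (2, 8)
```

Each training example is therefore one response paired with its seven lines of context.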

Running through the entire notebook on Google Colab's free GPU will create a folder named output-small containing the model files under /content/drive/My Drive/Colab Notebooks. I have about 700 lines of dialogue, and training takes less than ten minutes.

Feel free to train a larger and smarter model like DialoGPT-medium or even DialoGPT-large, as well as increase the number of training epochs by searching for num_train_epochs in the notebook.

How to Host the Model

We will host the model on Hugging Face, which provides a free API for us to query the model. Note that the free account has a limit of 30,000 input characters per month.

Sign up for Hugging Face and create a new model repository by clicking on New model. Obtain your API token by going to Edit profile > API Tokens.

Back in Google Colab, in the same notebook, we will push the model and the tokenizer to the Model Hub.

# install Git Large File Storage for our large model files
!sudo apt-get install git-lfs

# use the same email and name you did when signing up for huggingface.co
!git config --global user.email "YOUR EMAIL"
!git config --global user.name "YOUR NAME"

# push the model and the tokenizer
model.push_to_hub('YOUR MODEL REPO NAME', use_auth_token='YOUR API TOKEN')
tokenizer.push_to_hub('YOUR MODEL REPO NAME', use_auth_token='YOUR API TOKEN')

Then you should be able to go to your model page in a web browser and see that the model is pushed.

For our chatbot model to be recognized as a conversational model, we need to add a tag at the top of the Model Card README.md, before any Markdown content.

---
tags:
- conversational
---

# My Awesome Model

Here is my model:

r3dhummingbird/DialoGPT-medium-joshua · Hugging Face

You can start chatting with the model in the browser! Try to see if the chatbot has learned the name of your chosen character :)

It's a huge relief that my chatbot at least knows his name 😌

Now let's move beyond chatting in the browser and deploy the chatbot model as a Discord bot.

How to Build the Chatbot

This previous freeCodeCamp blog post about building a Discord chatbot is a good place to start.

Go to the Discord Developer Portal, create an application, and add a bot to it. Since our chatbot is only going to respond to user messages, checking General Permissions > View Channels and Text Permissions > Send Messages in the Bot Permissions setting is sufficient. Copy the bot's API token for later use.

Sign up for Repl.it and create a new Python Repl.

Let's store our API tokens for Hugging Face and Discord as environment variables, named HUGGINGFACE_TOKEN and DISCORD_TOKEN respectively. This helps keep them secret.

Set up environment variables

In main.py, we will create our Discord bot. I will break down the code step by step.

Imports

# the os module helps us access environment variables
# i.e., our API keys
import os

# these modules are for querying the Hugging Face model
import json
import requests

# the Discord Python API
import discord

Hugging Face API Endpoint

# the Hugging Face Inference API endpoint, with my username
API_URL = 'https://api-inference.huggingface.co/models/r3dhummingbird/'

Discord Bot Client Definition

class MyClient(discord.Client):
    def __init__(self, model_name):
        super().__init__()
        self.api_endpoint = API_URL + model_name
        # retrieve the secret API token from the system environment
        huggingface_token = os.environ['HUGGINGFACE_TOKEN']
        # format the header in our request to Hugging Face
        self.request_headers = {
            'Authorization': 'Bearer {}'.format(huggingface_token)
        }

    def query(self, payload):
        """
        make request to the Hugging Face model API
        """
        data = json.dumps(payload)
        response = requests.request('POST',
                                    self.api_endpoint,
                                    headers=self.request_headers,
                                    data=data)
        ret = json.loads(response.content.decode('utf-8'))
        return ret

    async def on_ready(self):
        # print out information when the bot wakes up
        print('Logged in as')
        print(self.user.name)
        print(self.user.id)
        print('------')
        # send a request to the model without caring about the response
        # just so that the model wakes up and starts loading
        self.query({'inputs': {'text': 'Hello!'}})

    async def on_message(self, message):
        """
        this function is called whenever the bot sees a message in a channel
        """
        # ignore the message if it comes from the bot itself
        if message.author.id == self.user.id:
            return

        # form query payload with the content of the message
        payload = {'inputs': {'text': message.content}}

        # while the bot is waiting on a response from the model,
        # set its status to typing for user-friendliness
        async with message.channel.typing():
            response = self.query(payload)
        bot_response = response.get('generated_text', None)
        
        # we may get ill-formed response if the model hasn't fully loaded
        # or has timed out
        if not bot_response:
            bot_response = 'Hmm... something is not right.'

        # send the model's response to the Discord channel
        await message.channel.send(bot_response)

Main Function

def main():
    # DialoGPT-medium-joshua is my model name
    client = MyClient('DialoGPT-medium-joshua')
    # retrieve the secret API token from the system environment
    client.run(os.environ['DISCORD_TOKEN'])

if __name__ == '__main__':
    main()

How to Deploy the Bot to Discord

With that, our bot is ready to go! Start the Repl script by hitting Run, add the bot to a server, type something in the channel, and enjoy the bot's witty response.

Me: Tell me something philosophical
JoshuaBot: How can mirrors be real if our eyes aren't real?

Well, as expected from my favorite quirky video game character :D See the end of this post for a 15-minute video of a real-time chat among me, my friend, and my bot.

Chatting with my bot in Discord. I drew his icon :D

How to Keep the Bot Online

One little problem with our bot is that it halts as soon as we close the Repl.it script. There are two ways around this.

If you have a Repl.it Hacker plan, which is free with a university email address, then you can set the Repl project to Always On.

Otherwise, you can wrap your Python script in a Flask web app and use a service like Uptime Robot to ping your web app every five minutes to keep it awake.
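As a minimal sketch of that pattern (the file and function names here are my own; it assumes Flask is available on your Repl), you can serve a tiny web app in a background thread:

```python
# keep_alive.py - a minimal sketch of the Flask keep-alive pattern
from threading import Thread

from flask import Flask

app = Flask('')

@app.route('/')
def home():
    # Uptime Robot pings this route to keep the Repl awake
    return 'Bot is alive!'

def run():
    app.run(host='0.0.0.0', port=8080)

def keep_alive():
    # run the web server in a background thread
    # so the Discord bot keeps running in the main thread
    Thread(target=run).start()
```

Then call keep_alive() at the top of main() in main.py, before client.run(...).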

You can read the last section in this freeCodeCamp Discord bot tutorial to learn more.

Final Thoughts and Resources

Cheers! You have reached the end of this tutorial. Hope you had fun creating Discord bots.

As a side note, my biggest takeaway from this project was perhaps not the technical part but rather about when to pivot and redefine a project. Although the user-facing component of this project turns out to be a Discord bot, my original vision was to build a React.js front-end for the chatbot.

However, as soon as I saw the slick chat UI on Hugging Face, I realized it was something I could never dream of building with just a bunch of React.js crash courses.

So I decided to keep my backend but change my front-end/user-facing component to something more unique and interesting – a Discord bot that never fails to entertain me and my friends on our chat server. :)

Resources

RuolinZheng08/twewy-discord-chatbot - the GitHub repository for this tutorial
TWEWY Game Script - the Kaggle dataset created by Lynn Zheng
r3dhummingbird/DialoGPT-medium-joshua - the fine-tuned model on Hugging Face

A 15-minute real-time chat among me, my friend, and my bot:

And here's a JavaScript version if you want to try building the bot in JS: