You can use artificial intelligence to make a Discord chat bot talk like your favorite character. It could be a Rick and Morty character, Harry Potter, Peppa Pig, or someone else.
Lynn Zheng created this course. Lynn is a software engineer at Salesforce and a hobbyist game developer. She is also a great teacher!
Here are the sections in this course:
- Gather data
- Train the model
- Deploy the model
- Build the Discord bot in Python
- Keep the bots online
Creating this awesome bot will make you feel like a programming wizard.
Watch the course below or on the freeCodeCamp.org YouTube channel (1-hour watch).
Want to make a Discord bot that talks like characters from Rick and Morty or Harry Potter? Maybe you want to make it talk like another favorite character.
In this course, Lynn will show you how to create a Discord bot that uses artificial intelligence to talk like a character of your choice.
Hi there. I'm Lynn. I'm a software engineer, hobbyist game developer and recent graduate from the University of Chicago.
In this tutorial, we're going to build a Discord AI chatbot that can speak like your favorite character.
Before we start, if you have seen my earlier tutorial video by any chance, know that this video is something different and more comprehensive.
So keep watching to the end.
The original tutorial and my first Discord bot started as a joke between me and my friends when we were playing video games, and it really surprised me how popular it became and how many people wanted to build their own bot based on that tutorial.
Therefore, I decided to update that tutorial to include more characters, as well as to show you how to find data for your favorite character.
Other topics that we will cover here include, but are not limited to, how to train the model, how to deploy the model, common errors you might see during the model training and deployment pipeline, and how to solve them.
Lastly, we will cover how to properly deploy the bot to a Discord server, how to limit it to certain channels, and how to keep the bot running indefinitely.
I hope you're excited.
And let's jump into the tutorial.
First, we're going to find data for our character.
My favorite sources are Kaggle, Transcript Wiki, and just random fandom websites that come up from a Google search.
And my research process goes like this: I first search on Kaggle to see if there are pre-made dialogue data sets.
For example, if we search for Rick and Morty, we get this nicely formatted data set that includes the name of the character and the lines they're speaking.
If we search for Harry Potter, here is another data set that includes the character and the sentence they're speaking.
Since we're building a chatbot, we need only these two columns in our data set: the character name and the line they're speaking.
So these dialogue data sets on Kaggle are perfect for our requirement.
Alright, if we succeeded in finding some data on kaggle, we can move on to the model training step.
But what if we cannot find a data set for our character on Kaggle?
For example, if I want to find a data set for Peppa Pig, it looks like there is no data set for the character.
In this case, we may need to find a raw transcript of the media, be it a video game, a cartoon, or a show.
And I've found that Transcript Wiki has some great resources.
So here we have a list of movies, shows, video games, musicals, and commercials.
For example, I was able to find the transcript for Peppa Pig.
And also movies like Batman on transcript wiki.
The transcript looks like this.
So we have the character name and their actions or the lines they're speaking.
We will see shortly how to turn a raw transcript like this into a data set like those we saw on Kaggle.
Besides Transcript Wiki, you may also just Google the name of your media with the keyword "transcript".
For example, my first bot was based on the game The World Ends with You.
And it has no results on either Kaggle or Transcript Wiki.
So what I did was to just Google the name of the game and "game transcript".
And it just happened that this fandom website has the full game transcript.
So be sure to utilize your Google search skills to find data for your character.
If, rather than a fictional character, you are more interested in a real-life character, you may search for interview scripts as your data source.
If you want to create a chatbot that speaks like yourself or your friends, you can treat text messages between you and your friends as dialogues and handcraft your data set.
There are tons of ways to get data for your character.
So be creative.
And now we will look at how to turn raw transcripts into a data set.
Now, suppose we have found our raw transcript; let's see how we can turn it into a two-column character-line data set.
Suppose we take this Peppa Pig transcript and copy it into a text file.
Now we go to Google Colab and upload our data file.
Now we create a Google Colab notebook.
And use it to parse our script.
So I'm going to name this parse_script.ipynb.
And from google.colab we import drive and then call drive.mount on /content/drive.
This will allow us to read the data from our Google Drive.
Alright, now that our drive has been mounted, let's import os and then use os.chdir to change directory into /content/drive/My Drive.
And then we see if there's anything in it.
Yeah, we have our Peppa Pig .txt file.
So here we are going to import the regular expression module to parse our transcript, put the parsed result into a pandas data frame, and export it as a CSV file just like those we saw on Kaggle.
This is going to be our regular expression pattern.
You don't have to be a pro at regular expressions to understand this part.
If we take the pattern to this regex testing site, with a Peppa Pig line as our test string, you will see that we have two capture groups, the first being the character's name, the second being the line spoken.
And also for the second line, from Mummy Pig, we have the character name and the line being spoken.
Right, so that's our regular expression.
Now, let's define a dictionary that will store our data.
So we know that we need the column name and the column line in our result data frame.
And then we open and read the file.
For each line, we match it with our regular expression pattern.
If there is a match, we extract the name and line from this regular expression match, and then append them to our dictionary here.
And then we convert this dictionary into a data frame.
Now we can inspect the data frame. Cool.
So we have the name, Peppa Pig, saying "I'm Peppa Pig."
And George makes a sound, and Mummy Pig makes a sound. Great.
We can also count the number of lines that belong to our character.
So we take the sum of df["name"] == "Peppa Pig".
And we see that Peppa Pig has 38 lines in our entire data frame.
So the length of our data frame is just over 100, and Peppa Pig has about one third of the lines.
The last step will be to export the data frame.
So we call df.to_csv, the name will be our Peppa Pig .csv file, and we will drop the index.
Now we should have a Peppa Pig .csv file in our drive.
And here are the name and line columns.
This is how we parse those raw transcripts into a file that can be used in our model training.
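The parsing steps above can be sketched roughly as follows. The video uses pandas for the data frame and the export; this sketch swaps in the standard-library csv module so it runs anywhere. The regular expression, the `Name: line` transcript format, and the sample lines are assumptions made for illustration, not the exact pattern or data from the video.

```python
import csv
import io
import re

# Assumed transcript format: "Character Name: line spoken" on each line.
# Stage directions in brackets or parentheses are skipped.
PATTERN = re.compile(r"^([^:\[\]()]+):\s*(.+)$")


def parse_transcript(text):
    """Return a list of {"name", "line"} rows for every matching line."""
    rows = []
    for raw in text.splitlines():
        match = PATTERN.match(raw.strip())
        if match:
            rows.append({"name": match.group(1).strip(),
                         "line": match.group(2).strip()})
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a name,line header and no index."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "line"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample in the assumed format.
sample = """Peppa Pig: I'm Peppa Pig.
George: Oink!
[Peppa jumps in a muddy puddle.]
Mummy Pig: Oink!"""

rows = parse_transcript(sample)
# Count how many lines belong to our character.
peppa_lines = sum(row["name"] == "Peppa Pig" for row in rows)
csv_text = to_csv(rows)
```

With pandas, the same rows could be passed to `pandas.DataFrame(rows)` and exported with `to_csv(path, index=False)`.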
So next, let's proceed on to the exciting step model training.
Now we're going to train the model.
Go to my GitHub repository linked in the description below and download the content.
We're going to use the model_train_upload_workflow.ipynb notebook, which looks like this.
Alright, now our file has been downloaded, we unzip the content.
And in here we have model_train_upload_workflow.ipynb.
We upload this notebook file to Google Drive and open it in Google Colab.
We are going to train a GPT model, which is short for generative pre-trained transformer.
In the Runtime menu, choose Change runtime type.
Make sure to select GPU because this will accelerate our model training.
So now here we mount the drive.
We install the Transformers module that we'll be using.
And we change directory into my drive.
Here are all the modules that we are importing.
If we're using a data set from Kaggle, we need to obtain our API key from Kaggle.
So go to our Kaggle profile, go to Account.
Scroll down to the API section.
Create a new API token and download this file as kaggle.json.
We'll go back to Google Drive and upload kaggle.json.
Now we can download our data set from Kaggle; we're going to use the Harry Potter data set as the example.
So grab the username and the data set name.
And our file is here, part one .csv.
Note that because there are white spaces in our file name, we get special characters in the file name.
So let's inspect the contents of our data file.
Well, CSV files are usually separated by commas, hence the name CSV.
However, this one looks like it's separated by semicolons.
So we need to take care of the semicolons when we're reading the data into a pandas data frame.
So the separator is a semicolon instead of a comma.
So let's sample data to see what's inside.
Alright, so we have character and sentence.
Notice that these two column names aren't exactly what we need.
We want the two columns of our data frame to be named name and line, as used down here in this cell, so we need to change the names of our columns.
Alright, let's resample our data.
Looks like we have successfully changed the name of our columns.
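As a rough illustration of the two fixes just described (a semicolon separator and renamed columns), here is a standard-library sketch. The video does this with pandas, where `pandas.read_csv(path, sep=";")` and `DataFrame.rename(columns=...)` would be the equivalent calls. The capitalized source column names and the second sample row are assumptions for illustration.

```python
import csv
import io


def read_dialogue(csv_text, delimiter=";", renames=None):
    """Read delimiter-separated text and rename columns to name/line."""
    # Assumed source column names; adjust to match the actual file header.
    renames = renames or {"Character": "name", "Sentence": "line"}
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return [
        {renames.get(key, key): (value or "").strip()
         for key, value in row.items()}
        for row in reader
    ]


# Made-up two-row sample using semicolons instead of commas.
sample = "Character;Sentence\nHarry;Seems a pity not to ask her.\nRon;(placeholder line)\n"
data = read_dialogue(sample)
harry_count = sum(row["name"] == "Harry" for row in data)
```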
Now let's see how big our data is.
So it only has 1,000 or so lines.
And let's see how many lines our character has.
Our character has 155 lines.
So here we change our character name to Harry.
And we now run this cell to create a context data frame that includes the current line our character is speaking and several lines directly preceding that line.
Context data frame is useful here because we are creating a conversational chatbot.
And we want to generate a response based on the conversation context.
So let's sample our context data frame.
So given the preceding context lines, our character responds with "Seems a pity not to ask her."
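One way to picture the context data frame is the sketch below, which pairs each line spoken by the chosen character with the lines directly preceding it. The function name, the `response`/`context` keys, and the default of three context lines are illustrative assumptions; the notebook's own cell may use a different layout and window size.

```python
def build_context_rows(rows, character, n_context=3):
    """Pair each line by `character` with the n_context lines before it."""
    examples = []
    for i in range(n_context, len(rows)):
        if rows[i]["name"] != character:
            continue
        # Most recent context first: context[0] is the line right before
        # the response, context[1] the one before that, and so on.
        context = [rows[j]["line"] for j in range(i - 1, i - 1 - n_context, -1)]
        examples.append({"response": rows[i]["line"], "context": context})
    return examples


# Tiny made-up dialogue to show the shape of the output.
dialogue = [
    {"name": "A", "line": "l0"},
    {"name": "B", "line": "l1"},
    {"name": "A", "line": "l2"},
    {"name": "Harry", "line": "l3"},
]
examples = build_context_rows(dialogue, "Harry", n_context=2)
```

Lines near the start of the transcript that lack enough preceding context are skipped in this sketch.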
Great, now that we have our data set, we split the data set into a training set and a test set.
This is because we don't want to overfit the model.
In the case of overfitting, the model will just memorize the lines from the data set and talk back to us using the exact lines. We don't want that; we want the conversation to be more organic.
So we're only training the model on a training set and evaluating the model on the test set.
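A minimal sketch of such a split, using only the standard library, is shown below. The 10% test fraction and the fixed seed are assumptions for illustration; the notebook may split differently, for example with a library helper.

```python
import random


def train_test_split(examples, test_fraction=0.1, seed=42):
    """Shuffle a copy of the examples and hold out a fraction for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The model is trained only on the first list and evaluated on the second.
    return shuffled[n_test:], shuffled[:n_test]


train_set, test_set = train_test_split(list(range(20)), test_fraction=0.25)
```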
So we continue running these cells to build data sets, caching checkpoints, and down here we built the model.
We'll build our model by fine-tuning Microsoft's pre-trained DialoGPT small.
Small here refers to the number of parameters in the model.
There are also a medium and a large model.
Generally, the larger the model, the longer it takes to train, but the smarter the model can get.
I would recommend training a medium model, as it's pretty smart and not too hard to train.
My production chatbot that is currently running on a server with 1000 plus users is also a medium model.
For this tutorial for the sake of time, I'm training just a small model.
You can see here it's downloading the model.
And this may take some time because it's essentially 300 megabytes.
Here are some hyper parameters that you may find useful.
For example, num_train_epochs is the number of training epochs.
This is defined to be four here, and this is the number of times that the model will cycle through the training set.
As long as the model is not overfitting, increasing the number of training epochs usually results in smarter models.
That's because the model has more time to cycle through the data set and pick up the nitty-gritty details.
There's another hyperparameter called the batch size.
This is the number of training examples that the model will see in a batch before it updates its weights.
I wouldn't recommend changing this unless you know what you're doing, since other hyperparameters like learning rate and temperature might be sensitive to this change in batch size.
However, if you're training a larger model on a larger data set and are running into memory errors, it might help to decrease the batch size to make the error go away.
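To make the relationship between these two hyperparameters concrete, here is a small sketch that counts how many weight updates a run performs. The `TrainingConfig` class and the batch size of 4 are hypothetical; only the four epochs come from the video, and the count ignores refinements such as gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    num_train_epochs: int = 4  # value mentioned in the video
    batch_size: int = 4        # assumed value, for illustration only

    def optimizer_steps(self, num_examples):
        """Number of weight updates: one per batch, repeated every epoch."""
        steps_per_epoch = math.ceil(num_examples / self.batch_size)
        return steps_per_epoch * self.num_train_epochs


default_steps = TrainingConfig().optimizer_steps(150)
# Halving the batch size lowers memory use per step but doubles the steps.
small_batch_steps = TrainingConfig(batch_size=2).optimizer_steps(150)
```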
The remaining cells have been configured to take this context data frame we've created, train the model, and save it to a folder called output-small.
Now let's run this main function.
Training will take some time; I trained my medium model for 12 epochs, and it took around two hours.
So do sit back and grab a snack while the model is training.
You can see the progress in the progress bars above.
Alright, here we get back a perplexity score.
This usually refers to how confused the model is.
If a model has a high perplexity, it means that the model is pretty confused as to which words to choose to respond to a given situation.
And the model might not be very smart.
In our case, our data set is pretty small.
It only has 150 plus lines.
So it makes sense that the perplexity is high.
To decrease the perplexity, we might need to train for more epochs.
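For intuition, perplexity is commonly defined as the exponential of the average per-token cross-entropy loss on the evaluation set. The sketch below applies that standard definition to a plain list of losses; it assumes this is also how the notebook computes its score, and the loss values are made up.

```python
import math


def perplexity(token_losses):
    """Exponential of the mean negative log-likelihood per token."""
    return math.exp(sum(token_losses) / len(token_losses))


# A model that is certain of every token has loss 0 and perplexity 1.
certain = perplexity([0.0, 0.0])
# A model choosing uniformly among 4 tokens has loss ln(4) and perplexity 4.
uniform_over_four = perplexity([math.log(4)] * 3)
```

Read this way, a perplexity of N means the model is roughly as unsure as if it were picking uniformly among N words at each step.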
But now that the training is complete, we can load and chat with the model here.
That's changed the name of the Hello, fellow read writer.
So let's ask pause.
There's no such thing as a bad read writer.
Great, it looks like our chatbot is capable of making and maintaining a conversation.
Now we can push the model to Hugging Face and start building our Discord chatbot.
Alright, now let's change directory just into the content folder.
Because we'll be doing our push there.
And we pip install the Hugging Face command-line client.
And then we log in using our credentials.
Right after logging in, we're assigned this token; we need to grab this token for something that we need to do afterwards.
So we can create a repository to store our model from the command line; mine is going to be called DialoGPT small Harry Potter.
And our empty model repository is right here.
There's nothing except for the .gitattributes file, but we will be adding the model files soon.
Now we install Git LFS, which stands for Git Large File Storage; this will allow us to push and pull our models.
And we replace this token with the token we just copied from above.
So here's my username and my token.
And recall that our training result is stored in this output-small directory.
And then we change directory into our DialoGPT small directory because we need to do git add and commit from there.
We install Git LFS and inspect the contents of our current directory, which should be DialoGPT small Harry Potter, and also print out the working directory to make sure that we are inside content.
Now we check the file status on Git.
So these are the files that we need to add to Git.
So we do a git add; this will take some time because pytorch_model.bin is pretty large.
And we configure a global user name and user email.
These are just my Hugging Face credentials.
Then we commit with the message "Initial commit".
And finally, we'll push the model.
It's about 400 megabytes because the PyTorch model is itself about 400 megabytes.
Alright, looks like the push is complete.
Now we see our pytorch model here.
However, there's one more thing that we need to do before we can converse with the model on Hugging Face.
That is, you see here, it's tagged as text generation.
However, we know that we are training a chatbot model, and we want our model to be conversational.
For that purpose, we need to edit the model card.
So we create a model card here and we put in our desired model tags; our tag is conversational.
We commit our model card.
And now our model is correctly tagged as conversational.
If we go to the main model page, we can start chatting with the model here.
Alright, now that we have pushed our model to Hugging Face, we're ready to use it in our Discord chatbots.
Now that we have our model, let's build the Discord bot.
Here on Discord, I have my server, Lynn's DevLab.
I have two channels, one for the Python bot and one for the JavaScript bot.
The reason why we have a separate channel for each bot is that we don't want the bots to be talking to each other.
So after we build the bots, we will learn how to set their permissions correctly so that they don't go outside of their dedicated channels.
So we go to the Discord developer portal and create an application; we need one application per bot.
So our name will be chatty bot Python.
So here we create a bot.
And I'm going to name this Harry Potter Bot Python and upload an icon.
We will be using this API token here when we create our bot in Python.
We're going to host our bot on Repl.it, so sign up for Repl.it and create a new Python repl.
I'm going to name this one chatty bot Python, and in here, we will need to store our API tokens for Hugging Face and Discord as environment variables.
So here is the tab for the secrets, which are the environment variables.
So the first one will be HUGGINGFACE_TOKEN.
And for the value, we'll go to our Hugging Face profile, Edit profile, API tokens, copy the API token, come back here and fill in that value.
Next, we'll create DISCORD_TOKEN.
And for this value, we go to the Discord developer portal and copy the token.
Then we add the token here.
And our environment variables are all set.
Next, I have the Python file in my GitHub repository called discord_bot.py.
So we grab the code from here, and I'll explain the code line by line.
Starting from line one, we first import the os module, which will help us read our environment variables.
Next, we import modules that are useful for querying the Hugging Face model.
Finally, we import the discord module.
And here I have my API URL pointing to my username.
And we define our bot as follows.
In the init function, it takes in a model name, which for me will be DialoGPT small Harry Potter.
Then we store this API endpoint by concatenating the API URL, which is my profile link, with the model name.
Now we retrieve the secret API token from the system environment by looking up the Hugging Face token in os.environ.
Next, we format the header in our request to Hugging Face.
For the authorization part, we put Bearer and the Hugging Face token.
Next, we define the query method that takes in the payload.
We dump the payload as a JSON string.
And we use the requests module to make an HTTP POST request to the API endpoint, using our defined request headers, which contain our Hugging Face API key, and passing in the data.
Once the request finishes, it should give us a response object; we decode it from UTF-8, load the result from the JSON string, and return it.
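The query logic described above might look roughly like this. The video uses the requests module; this sketch swaps in the standard-library urllib so it has no dependencies. The base URL, the `your-username` placeholder, the `ModelClient` class name, the `build_request` helper, and the `HUGGINGFACE_TOKEN` variable name are all assumptions for illustration, and the hosted API may have changed since the video was recorded.

```python
import json
import os
import urllib.request

# Assumed base URL for the hosted inference API; the video only says that
# the URL points at the author's username. Replace the placeholder.
API_URL = "https://api-inference.huggingface.co/models/your-username/"


class ModelClient:
    def __init__(self, model_name, token=None):
        # Endpoint = base URL + model repository name.
        self.api_endpoint = API_URL + model_name
        # Read the secret from the environment unless one is passed in.
        token = token or os.environ.get("HUGGINGFACE_TOKEN", "")
        self.request_headers = {"Authorization": "Bearer {}".format(token)}

    def build_request(self, payload):
        """Dump the payload to JSON and wrap it in an HTTP POST request."""
        data = json.dumps(payload).encode("utf-8")
        return urllib.request.Request(
            self.api_endpoint, data=data,
            headers=self.request_headers, method="POST")

    def query(self, payload):
        """Send the request, decode the UTF-8 body, and parse it as JSON."""
        with urllib.request.urlopen(self.build_request(payload)) as response:
            return json.loads(response.read().decode("utf-8"))


client = ModelClient("my-model", token="example-token")
request = client.build_request({"inputs": {"text": "Hello"}})
```

Calling `client.query(...)` performs the actual network request, so it needs a valid token and a deployed model.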
Next, we define an asynchronous function named on_ready.
The next two function definitions are based on the Discord API.
Both are asynchronous functions.
The first one is on_ready; this function will be called when the bot is logging in.
So when the bot is logging in, we print out "Logged in as" followed by the bot's name and the bot's ID, so that we know that the bot is functioning.
Next, because our bot is a chatbot, it needs to respond to messages.
So on_message is a method that will be called each time the bot sees a message in the channel.
So given the message, if the message is coming from the bot itself, the bot ignores the message and does not reply to it.
Otherwise, it will form a query payload with the content of the message.
And we want to make the bot more user friendly.
So while the bot is waiting for the HTTP response from the model, we set its status as typing so that the user will know that the bot is generating its response.
So this is an asynchronous call with message.channel.typing().
We call self.query using the payload and get back the response.
If there is a valid generated response, there will be a generated_text field in this response.
And we'll be able to get that out as the bot's response.
Otherwise, there might be an error in the response, and we just log the error message so that we can debug later on.
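The response handling just described can be isolated into a small helper like the one below. The function name and the fallback text are hypothetical; the `generated_text` and `error` field names follow the description in the video.

```python
def extract_reply(response, fallback="(no response)"):
    """Return the generated text, or log the error and return a fallback."""
    if isinstance(response, dict) and "generated_text" in response:
        return response["generated_text"]
    if isinstance(response, dict) and "error" in response:
        # Log the error so it can be debugged later.
        print("Model error:", response["error"])
    return fallback


ok_reply = extract_reply({"generated_text": "Hi there"})
error_reply = extract_reply({"error": "model is loading"}, fallback="...")
```

Inside on_message, a helper like this would be applied to the value returned by the query method before sending the reply to the channel.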
Finally, we use another asynchronous method to send the bot's response to the channel using message.channel.send.
And that's it for our bot definition.
In the main function we just create a bot, passing in the model name.
So for me, this is DialoGPT small Harry Potter, and then we call client.run, looking up the Discord token from the environment variables.
Great, now that our bot is all set up, let's invite the bot to our channel.
In the OAuth2 tab, we are going to select bot.
And for the bot permissions, the only thing we need is Send Messages.
So we copy this URL, paste it in a new browser window, and invite it to my server.
Alright, now we see that our bot has appeared; however, it shows as offline.
So we need to run the repl.
So we hit run here.
And Repl.it is installing all our dependencies and imports.
Great, now our bot has logged in as Harry Potter Bot Python, and this is its unique ID.
Let's go to the server.
And now that the bot is online, we don't want the bot to appear in the general channel.
So we go to the channel settings, permissions.
Under advanced permissions, we add the bot, remove its permission to send messages, and save the changes.
Now, let's see what happens if I type something in the general channel; nothing should happen because the bot shouldn't be able to send a message.
Nothing happens, although the bot is online.
And now this bot should work in the Python bot channel.
So let's say hello.
And we briefly saw that there's a typing prompt.
So yeah, this is how we build a bot in Python.
One thing to note in Repl.it is that, because we took away the bot's permission to send messages in the general channel, it is showing an exception, and this is totally okay.
If you don't like seeing this exception, you can use a try/except block and log this exception.
Now for the JavaScript bot, we go back to the Discord developer portal and create a new application.
And I'll create a bot.
Back in Repl.it, we create a Node.js repl.
I'm going to call it chatty bot JS.
We again create two environment variables.
The first one is HUGGINGFACE_TOKEN; I copy my API token from my profile, Edit profile, put it here, and then copy my Discord bot token.
I call this one DISCORD_TOKEN and add the value.
Now we have our environment variables set up.
Let's copy and paste the code.
And I'll go through the code line by line.
We import fetch for making HTTP requests, just like we did in Python.
And we initialize a new Discord client and define the model URL, which is just my username and the model name.
So this is DialoGPT small Harry Potter.
And this is the callback that is called when the bot is ready, just like the on_ready function we saw in Python.
So when the bot is ready and logged in, we print out "Logged in as" followed by client.user.tag.
And here is another callback.
This time it's for the message event, and we use an asynchronous callback because we are making HTTP requests.
Like in the Python script, we ignore the message if the message is from the bot itself.
We do that by checking if message.author is the bot.
Now we form the payload.
So the payload is a dictionary containing inputs, with text set to the message content, which is the message that the bot has received.
And we form the request headers by again using the Hugging Face API key.
So we read the Hugging Face token from the environment, process.env.HUGGINGFACE_TOKEN, and form the headers.
Right before we start making the HTTP request, we set the bot's status to typing.
Now we query the server.
So the response is the result from this call to fetch using HTTP POST, given the payload as the body and the headers using the Hugging Face token.
And we convert the response into JSON format and extract the generated_text field.
If there isn't a generated_text field in the response, but instead the response contains an error field,
this means that the bot has encountered some errors, and we may want to print out the error for further debugging.
Now that we have the bot's response, we can clear its typing status and send the message to the channel as a reply.
This ends the definition of our client's message callback.
Down here, we log in using the Discord token.
Now let's invite the bot to our server.
So we go to OAuth2.
We check bot; its only permission is to send messages.
We copy this URL, paste it in a new browser window, and invite it to our server.
Great looks like we have another bot.
Remember to click on save changes.
Otherwise, the bot's icon wouldn't be showing.
Now we have our bot; however, it's not logged in.
So we need to go back to the repl to run our script.
But before we run our repl, let's make sure that the bot doesn't have access to the general channel.
Nor should it have access to the Python channel, because it's not supposed to go there.
We remove its permission to send messages.
And always remember to save the changes.
We do the same thing for it on the Python channel and go to the JS channel.
This time, the one we need to remove is the Python bot.
Now we go back to run our repl.
If you see this error, this means that the discord version that npm is trying to install is wrong.
You can see that there are these warnings that the newest discord module is not compatible with Repl.it's version of Node or npm.
So we need to manually change something in package.json.
So here, we just use the older version and rerun it.
Cool, the chatbot is also online.
Let's see if it responds to our messages.
So this is now an error message telling us that the model is still loading.
The model will usually take one or two minutes to load.
So let's give it some time.
Great looks like our bot is responding to us.
And because we have set the bots' permissions correctly, the Python bot is not responding to any messages here.
And in our general channel, no bot is allowed to talk.
One thing to note is that if I close the browser tab for the Python bot, the bot is no longer responding, although it still shows that the bot is online.
So in the next part, we're going to look at how to keep the bots running indefinitely, even when we close the browser tab.
In order to get our bot to run indefinitely, we need to create a web server in Repl.it and set up a service called UptimeRobot to continuously ping the web server.
So this is for the Python bot.
And we create a new file called keep_alive.py.
And we add the code for a web server like this.
And in our main.py, we import that part.
And down here in the main function, right before the bot runs, we ask it to be kept alive.
We run it.
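The contents of keep_alive.py are not shown in this transcript, so the following is only a sketch of the idea: a tiny web server running in a background thread that answers pings. It uses the standard-library http.server module; the handler name, response text, and default port are assumptions, and the video's file may use a different web framework.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    BODY = b"I'm alive"

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(self.BODY)

    def do_HEAD(self):
        # Some monitors ping with HEAD instead of GET.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # keep the console free of per-request log lines


def keep_alive(host="0.0.0.0", port=8080):
    """Start the web server in a daemon thread and return it."""
    server = HTTPServer((host, port), PingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `keep_alive()` right before the bot's blocking run call lets the server and the bot run side by side, since the server lives on its own thread.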
When the code runs, we see a URL shown in this tab.
And we copy this URL and bring it to our UptimeRobot service.
So here is the UptimeRobot website.
And I already have an account.
So I'll just go to my dashboard and add a new monitor.
The monitor type is going to be HTTP(s), the friendly name is Discord Python bot, and the URL is the one we copied from here.
And the monitoring interval will be a ping every five minutes; that should be sufficient.
And finally we create the monitor and close it.
Now let's see if our Python bot is capable of running indefinitely.
Alright, I'm going to close this tab containing my Python script.
And it looks like our model is still up.
It's just that after some time, the model on the Hugging Face backend will need to reload, but the bot itself is responding.
For the JavaScript bot, we create a new file called server.js and copy and paste this code, then import this part from the file that we just created.
Finally, right before the bot runs, we are going to call keep alive.
Start the server.
Alright, the server is now ready.
We copy this URL, go to UptimeRobot, and add a new monitor.
It's again an HTTP(s) monitor, named Discord JS bot.
And the URL is like this.
We create the monitor.
Now we can safely close this browser window and go back to Discord; our chatbot is still running.
Now we're all done.
I hope you enjoyed this video.
Please subscribe for more content like this and I'll see you in the next one.