by ahmad abdolsaheb

# How to make your Tic Tac Toe game unbeatable by using the minimax algorithm

I struggled for hours scrolling through tutorials, watching videos, and banging my head on the desk trying to build an unbeatable Tic Tac Toe game with a reliable Artificial Intelligence. So if you are going through a similar journey, I would like to introduce you to the Minimax algorithm.

Like a professional chess player, this algorithm sees a few steps ahead and puts itself in the shoes of its opponent. It keeps playing ahead until it reaches a terminal arrangement of the board (**terminal state**) resulting in a tie, a win, or a loss. Once in a terminal state, the AI will assign an arbitrary positive score (+10) for a win, a negative score (-10) for a loss, or a neutral score (0) for a tie.

At the same time, the algorithm evaluates the moves that lead to a terminal state based on the players’ turn. It will choose the move with maximum score when it is the AI’s turn and choose the move with the minimum score when it is the human player’s turn. Using this strategy, Minimax avoids losing to the human player.

Try it for yourself in the following game.

A Minimax algorithm can be best defined as a recursive function that does the following things:

- return a value if a terminal state is found (+10, 0, -10)
- go through available spots on the board
- call the minimax function on each available spot (recursion)
- evaluate returning values from function calls
- and return the best value

If you are new to the concept of recursion, I recommend watching this video from Harvard’s CS50.

To completely grasp the Minimax’s thought process, let’s implement it in code and see it in action in the following two sections.

### Minimax in Code

For this tutorial you will be working on a near end state of the game which is shown in figure 2 below. Since minimax evaluates every state of the game (hundreds of thousands), a near end state allows you to follow up with minimax’s recursive calls easier(9).

For the following figure, assume the AI is X and the human player is O.

To work with the Ti Tac Toe board easier, you should define it as an array with 9 items. Each item will have its index as a value. This will come handy later on. Because the above board is already populated with some X and Y moves, let us define the board with the X and Y moves already in it (*origBoard*).

`var origBoard = ["O",1,"X","X",4,"X",6,"O","O"];`

Then declare *aiPlayer *and* huPlayer *variables and set them to “X” and “O” respectively*.*

Additionally, you need a function that looks for winning combinations and returns true if it finds one, and a function that lists the indexes of available spots in the board.

Now let’s dive into the good parts by defining the Minimax function with two arguments *newBoard* and *player*. Then, you need to find the indexes of the available spots in the board and set them to a variable called *availSpots*.

Also, you need to check for terminal states and return a value accordingly. If O wins you should return -10, if X wins you should return +10. In addition, if the length of the *availableSpots* array is zero, that means there is no more room to play, the game has resulted in a tie, and you should return zero.

Next, you need to collect the scores from each of the empty spots to evaluate later. Therefore, make an array called *moves* and loop through empty spots while collecting each move’s index and score in an object called *move*.

Then, set the index number of the empty spot that was stored as a number in the *origBoard* to the index property of the *move* object. Later, set the empty spot on the *newboard* to the current player and call the *minimax *function with other player and the newly changed *newboard*. Next, you should store the object resulted from the *minimax* function call that includes a *score* property to the *score* property of the *move* object.

If the minimax function does not find a terminal state, it keeps recursively going level by level deeper into the game. This recursion happens until it reaches a terminal state and returns a score one level up.

Finally, Minimax resets *newBoard* to what it was before and pushes the *move* object to the *moves *array.

Then, the minimax algorithm needs to evaluate the best* move* in the *moves* array. It should choose the *move* with the highest score when AI is playing and the *move* with the lowest score when the human is playing. Therefore, If the *player* is *aiPlayer*, it sets a variable called *bestScore *to a very low number and loops through the *moves *array, if a *move* has a higher *score* than *bestScore*, the algorithm stores that *move*. In case there are moves with similar score, only the first one will be stored.

The same evaluation process happens when *player* is *huPlayer*, but this time *bestScore* would be set to a high number and Minimax looks for a move with the lowest score to store.

At the end, Minimax returns the object stored in *bestMove*.

That is it for the minimax function. :) you can find the above algorithm on github and codepen. Play around with different boards and check the results in the console.

In the next section, let’s go over the code line by line to better understand how the minimax function behaves given the board shown in figure 2.

### Minimax in action

Using the following figure, let’s follow the algorithm’s function calls (**FC**) one by one.

Note: In figure 3, large numbers represent each function call and levels refer to how many steps ahead of the game the algorithm is playing.

*1.**origBoard *and *aiPlayer* is fed to the algorithm. The algorithm makes a list of the three empty spots it finds, checks for terminal states, and loops through every empty spot starting from the first one. Then, it changes the *newBoard* by placing the *aiPlayer *in the first empty spot.* *After that,* *it calls itself with *newBoard *and the *huPlayer *and waits for the FC to return a value.

**2. **While the first FC is still running, the second one starts by making a list of the two empty spots it finds, checks for terminal states, and loops through the empty spot starting from the first one. Then, it changes the *newBoard* by placing the *huPlayer *in the first empty spot.* *After that* *it calls itself with *newBoard *and the *aiPlayer *and waits for the FC to return a value.

**3. **Finally the algorithm makes a list of the empty spots, and finds a win for the human player after checking for terminal states. Therefore, it returns an object with a score property and value of -10.

Since the second FC listed two empty spots, Minimax changes thenewBoardby placinghuPlayerin the second empty spot. Then, it calls itself with the new board and theaiPlayer.

**4.** The algorithm makes a list of the empty spots, and finds a win for the human player after checking for terminal states. Therefore, it returns an object with a score property and value of -10.

On the second FC, the algorithm collects the values coming from lower levels (3rd and 4th FC). SincehuPlayer’s turn resulted in the two values, the algorithm chooses the lowest of the two values. Because both of the values are similar, it chooses the first one and returns it up to the first FC.

At this point the first FC has evaluated the score of movingaiPlayerin the first empty spot. Next, it changes thenewBoardby placingaiPlayerin the second empty spot. Then, it calls itself with thenewBoardand thehuPlayer.

**5.** On the fifth FC, The algorithm makes a list of the empty spots, and finds a win for the human player after checking for terminal states. Therefore, it returns an object with a score property and value of +10.

After that, the first FC moves on by changing thenewBoardand placingaiPlayerin the third empty spot. Then, it calls itself with the new board and thehuPlayer.

**6.** The 6th FC starts by making a list of two empty spots it finds, checks for terminal states, and loops through the two empty spots starting from the first one. Then, it changes the *newBoard* by placing the *huPlayer *in the first empty spot.* *After that,* *it calls itself with *newBoard *and the *aiPlayer *and waits for the FC to return a score.

**7.** Now the algorithm is two level deep into the recursion. It makes a list of the one empty spot it finds, checks for terminal states, and changes the *newBoard* by placing the *aiPlayer *in the empty spot.* *After that,* *it calls itself with *newBoard *and the *huPlayer *and waits for the FC to return a score so it can evaluate it.

**8. **On the 8th FC,** **the algorithm makes an empty list of empty spots, and finds a win for the *aiPlayer* after checking for terminal states. Therefore, it returns an object with score property and value of +10 one level up (7th FC).

The 7th FC only received one positive value from lower levels (8th FC). BecauseaiPlayer’s turnresulted in that value, the algorithm needs to return the highest value it has received from lower levels. Therefore, it returns its only positive value (+10) one level up (6th FC).

Since the 6th FC listed two empty spots, Minimax changesnewBoardby placinghuPlayerin the second empty spot. Then, calls itself with the new board and theaiPlayer.

**9. **Next, the algorithm makes a list of the empty spots, and finds a win for the *aiPlayer* after checking for terminal states. Therefore, it returns an object with score properties and value of +10.

At this point, the 6 FC has to choose between the score (+10)that was sent up from the 7th FC (returned originally from from the 8 FC) and the score (-10) returned from the 9th FC. SincehuPlayer’s turn resulted in those two returned values, the algorithm finds the minimum score (-10) and returns it upwards as an object containing score and index properties.

Finally, all three branches of the first FC have been evaluated ( -10, +10, -10). But because aiPlayer’s turn resulted in those values, the algorithm returns an object containing the highest score (+10) and its index (4).

**In the above scenario, Minimax concludes that moving the X to the middle of the board results in the best outcome. :)**

### The End!

By now you should be able to understand the logic behind the Minimax algorithm. Using this logic try to implement a Minimax algorithm yourself or find the above sample on** **github or codepen and optimize it.

*Thanks for reading! If you liked this story, please recommend it by clicking the ❤ button on the side and sharing it on social media.*

*Special thanks to Tuba Yilmaz, Rick McGavin, and Javid Askerov* *for reviewing this article.*