Improving a Machine Learning model is an iterative process. No machine learning model is ever perfect on the first attempt at training it. It usually takes a few iterations.

But a common mistake you might make as a Machine Learning practitioner is failing to realize that you need to evaluate (or measure) your model to improve it. Model evaluation is an integral part of the machine learning model improvement and development process.

It helps you find the model that best represents your data and estimate how well the chosen model will work in the future.

In a bid to enable Machine Learning engineers to look at the performance of their models at a deeper level, Google created TensorBoard.

What is TensorBoard?

At its core, TensorBoard provides the measurements and visualizations you need for your machine learning workflow. It lets you track experiment metrics like loss and accuracy, visualize the model graph, project embeddings to a lower-dimensional space, and much more.

TensorBoard uses graph concepts to express the data flow and the model's operations. Even though it allows you to visualize the parameters and the graph structures of big and complex models, its interface is quite simple and very intuitive.

In this tutorial, you will analyze and evaluate the results of a trained machine learning model. You will train the model on the MNIST (Modified National Institute of Standards and Technology) database, a large collection of handwritten digits that is commonly used for training image processing systems.


To complete this tutorial, you will need:

  1. Fundamental understanding of the workings of Machine Learning models.
  2. A new Google Colab notebook to run the Python code in your Google Drive. You can set this up by following this tutorial.

Step 1 – How to Set Up TensorBoard

It is important to note that TensorBoard, which is installed with TensorFlow, does not need to be installed using pip in this setup. This is because when you create a new notebook on Google Colab, TensorFlow is already pre-installed and optimized for the hardware being used.

A blank (new) notebook in dark mode 

With your Google Colab notebook ready, start by loading the tensorboard extension using the %load_ext magic in your notebook.

%load_ext tensorboard

After doing this, import the necessary libraries (that is, tensorflow and datetime) like this:

import tensorflow as tf
import datetime

With this, you have loaded the TensorBoard extension and set it up. You can now get started.

Step 2 – How to Create and Train the Model

The dataset you will use for this tutorial is the MNIST dataset, which consists of small, square 28×28-pixel grayscale images of handwritten single digits between 0 and 9: 60,000 for training and 10,000 for testing.

The dataset, which you can load from the Keras datasets library, is often used to train digit recognition machine learning models.

Start by assigning the Keras MNIST dataset module to a variable named mnist.

mnist = tf.keras.datasets.mnist

Then load the data, which comes already split into training and test sets, like this:

(x_train, y_train),(x_test, y_test) = mnist.load_data()

You also need to normalize the pixel values of your training and test sets. Each pixel is an integer from 0 to 255, so dividing by 255 scales the images to the [0, 1] range.

x_train, x_test = x_train / 255.0, x_test / 255.0
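As a rough illustration of what this scaling does, the sketch below applies the same division to a handful of pixel values using plain Python lists instead of the real image arrays. The sample values and the helper name scale_pixels are made up for illustration.

```python
# Hypothetical sample of 8-bit grayscale pixel values (0 = black, 255 = white).
sample_pixels = [0, 51, 128, 255]


def scale_pixels(pixels, max_value=255.0):
    """Divide every pixel by max_value so the results fall in [0, 1]."""
    return [p / max_value for p in pixels]


scaled = scale_pixels(sample_pixels)
print(scaled)
```

The division in the tutorial code does the same thing element by element across every image in the training and test arrays.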

Then define a function that builds the machine learning model you will train on the dataset. You will use the Sequential Keras model. At its core, it groups a linear stack of layers into a tf.keras.Model whilst providing training and inference features on this model.

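def create_model():
  # Build a simple feed-forward classifier for 28x28 grayscale digit images.
  return tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
  ])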

Note that:

  • The .Flatten layer flattens the input without affecting the batch size. The input shape in this example is 28 x 28, so each image becomes a vector of 784 values.
  • The first .Dense layer is a regular densely connected neural network layer. Its activation function is 'relu' and the dimensionality of its output space is 512.
  • The .Dropout layer randomly sets a fraction of its input units to zero during training, which helps reduce overfitting. In this tutorial, that fraction is 0.2.
  • Like the first one, the second .Dense layer is also a regular densely connected layer. Its activation function is 'softmax' and the dimensionality of its output space is 10, one for each digit class.
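To make the layer sizes above more concrete, here is a small sketch in plain Python (no TensorFlow needed) that works out how many trainable parameters the two dense layers hold and shows what a softmax does to a list of scores. The helper names and the example scores are hypothetical and only there for illustration.

```python
import math


def dense_params(n_inputs, n_units):
    """A dense layer has one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


flattened = 28 * 28                            # Flatten: 28 x 28 image -> 784 values
hidden_params = dense_params(flattened, 512)   # first Dense layer
output_params = dense_params(512, 10)          # second Dense layer
# Flatten and Dropout have no trainable parameters of their own.
total_params = hidden_params + output_params
print(flattened, hidden_params, output_params, total_params)

example_scores = [2.0, 1.0, 0.1]  # hypothetical raw scores for three classes
probs = softmax(example_scores)
print(probs)
```

Most of the model's parameters sit in the first dense layer, which is worth keeping in mind if you later experiment with the layer width.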

Then call the function to create the model like this:

model = create_model()

With the model created, you can train it with suitable parameters.


Using the datetime library you previously imported, place the logs in a timestamped subdirectory so that you can easily select between different training runs.

The logs are important because TensorBoard reads from them to display the various visualizations.

log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
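The timestamp format is what keeps runs easy to tell apart. The sketch below uses only the standard library and two made-up timestamps to show the directory names this pattern produces, and that they sort in chronological order. The helper name make_log_dir is hypothetical.

```python
import datetime


def make_log_dir(now, root="logs/fit"):
    """Build a per-run log directory name from a datetime."""
    return root + "/" + now.strftime("%Y%m%d-%H%M%S")


# Two made-up run start times on the same day.
first_run = make_log_dir(datetime.datetime(2024, 1, 5, 9, 30, 0))
second_run = make_log_dir(datetime.datetime(2024, 1, 5, 14, 2, 7))

print(first_run)   # logs/fit/20240105-093000
print(second_run)  # logs/fit/20240105-140207

# Year-month-day-time ordering means alphabetical order is chronological order.
print(sorted([second_run, first_run]) == [first_run, second_run])  # True
```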

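Finally, compile and train the machine learning model. The optimizer, loss, and metric below are assumed settings, since this step does not specify them. Adjust them to suit your experiment.

# Assumed compile settings (not specified in this tutorial).
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 3 epochs and write logs for TensorBoard through the callback.
model.fit(x=x_train,
          y=y_train,
          epochs=3,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback])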

Step 3 – How to Evaluate the Model

To start TensorBoard within the Google Colab notebook using magics, run the code below:

%tensorboard --logdir logs/fit

You will see the TensorBoard come up.
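If the dashboards come up empty, one thing worth checking is whether any run directories were actually written under the log root. Here is a minimal sketch, using only the standard library, that lists them. The helper name list_runs is hypothetical, and the demo builds a throwaway directory tree instead of touching your real logs.

```python
import tempfile
from pathlib import Path


def list_runs(log_root):
    """Return the sorted names of run subdirectories under log_root."""
    root = Path(log_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


# Demo on a temporary directory with two made-up run names.
with tempfile.TemporaryDirectory() as tmp:
    fit_dir = Path(tmp) / "logs" / "fit"
    for run_name in ["20240105-140207", "20240105-093000"]:
        (fit_dir / run_name).mkdir(parents=True)
    runs = list_runs(fit_dir)
    missing = list_runs(Path(tmp) / "no-such-dir")

print(runs)
print(missing)
```

In the notebook, you would call list_runs("logs/fit") after training to see one entry per run.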


You can now view dashboards shown on tabs at the top to evaluate your machine learning models and improve them accordingly.

Step 4 – How to Improve the Model

The point of evaluating your Machine Learning model is to gain better insight to improve the algorithm. With these visuals, you can now see the in-depth performance of the model.

  • The Scalars dashboard demonstrates how the metrics and loss fluctuate with each epoch. You can also use it to monitor other scalar values, such as training efficiency and learning rate.
  • The Graphs dashboard helps you visualize your model. You can also check that it is constructed properly by looking at the Keras graph of layers.
    The graph as shown in TensorBoard

    You can see the graph on its own more clearly here:

    The graph shown on its own
  • The Distributions and Histograms dashboards show the distribution of a Tensor over time. This might help you see the weights and biases and make sure they are changing as you would anticipate.
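One common way to read the Scalars dashboard is to look for the epoch where the validation loss bottoms out. The sketch below shows that idea on a list of numbers. The loss values are made up purely for illustration and are not results from this tutorial's model, and the helper name best_epoch is hypothetical.

```python
# Hypothetical per-epoch validation losses, made up for illustration only.
val_loss = [0.110, 0.082, 0.071, 0.069, 0.073, 0.078]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss."""
    return min(range(len(losses)), key=lambda i: losses[i]) + 1


print(best_epoch(val_loss))  # 4
```

If the validation loss starts rising again after that point while the training loss keeps falling, the model is likely beginning to overfit.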

To improve this model, you will adjust the number of epochs from 3 to 6 and see how the model performs.

The number of epochs is the number of complete passes over the entire training dataset that the machine learning algorithm makes during training.
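To get a feel for what doubling the epochs means in practice, this sketch counts the weight-update steps involved. It assumes the 60,000 MNIST training images and a batch size of 32, which is the Keras default when none is passed to fit. The helper name total_steps is hypothetical.

```python
import math


def total_steps(n_samples, batch_size, epochs):
    """Number of batches processed over all epochs."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


print(total_steps(60_000, 32, 3))  # 5625
print(total_steps(60_000, 32, 6))  # 11250
```

Doubling the epochs doubles the number of updates, and roughly doubles the training time as well.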

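Intuitively, training for more epochs often improves the performance of your machine learning model, at least up to a point. Too many epochs can cause overfitting, so keep an eye on the validation metrics. To do this, run the training again with epochs set to 6:

model.fit(x=x_train,
          y=y_train,
          epochs=6,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback])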

With this change made, you can launch TensorBoard again like this:

%tensorboard --logdir logs/fit

From the newly generated visuals, you can compare the two runs and see how the model's performance has improved.


In this article, you learned how you can use TensorBoard to inspect and improve your ML model's performance.

If at this point you have questions about the difference between TensorBoard and TensorFlow Model Analysis (TFMA), this is a valid concern. After all, both are tools for providing the measurements and visualizations needed during the Machine Learning workflow.

But it is important to note that you use each of these tools in different stages of the development process. At its core, TensorBoard is used to analyze the training process itself, while TFMA is concerned with the analysis of the 'finished' trained model.

Thank you for reading!