Handwritten digit recognition is one of the classic “Hello World” projects in deep learning. Thanks to TensorFlow and GPU acceleration, we can build a highly accurate model in minutes.

In this guide, we’ll walk through setting up a Convolutional Neural Network (CNN) on an Ubuntu GPU server to recognize digits from the famous MNIST dataset and even test it on your own handwritten images.

Prerequisites

  • An Ubuntu 24.04 server with an NVIDIA GPU.
  • A non-root user with sudo privileges.
  • NVIDIA drivers installed.
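
Before starting, you can optionally confirm that the driver is active; nvidia-smi should print a table listing your GPU and the installed driver version:

nvidia-smi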

Step 1: Set Up the Project Environment

Before diving into coding, we must prepare our GPU-powered Ubuntu server with the right tools.

1. First, update the package index and install Python together with pip and the venv module, so we can work in an isolated environment and avoid dependency conflicts:

sudo apt update
sudo apt install -y python3 python3-pip python3-venv

2. Now, create and activate a virtual environment:

python3 -m venv tf-gpu-env
source tf-gpu-env/bin/activate

3. TensorFlow can leverage NVIDIA CUDA to accelerate training dramatically. Install it along with NumPy, Matplotlib, and Pillow (used later to load your own handwritten images):

pip install tensorflow numpy matplotlib pillow
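
If TensorFlow fails to detect your GPU in the next step, note that recent TensorFlow releases (2.15 and later) provide an optional extra that bundles matching CUDA libraries through pip. Installing it is one way to avoid setting up the CUDA toolkit separately:

pip install tensorflow[and-cuda]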

4. Open an interactive Python shell.

python3

Verify that TensorFlow recognizes your GPU:

import tensorflow as tf
print("Num GPUs Available:", len(tf.config.list_physical_devices('GPU')))

Output.

Num GPUs Available: 1
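
If the count is 0 instead, you can narrow down the cause from the same Python shell. The checks below are a minimal sketch using standard TensorFlow APIs; the exact contents of the build info vary by version:

import tensorflow as tf

# Confirm this TensorFlow build was compiled with CUDA support
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Show which CUDA/cuDNN versions this build expects
print("Build info:", tf.sysconfig.get_build_info())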

5. Press CTRL+D to exit from the Python shell.

Step 2: Build and Train the Model

Now for the fun part: building the CNN that will recognize handwritten digits.

1. Create the training script.

nano train_mnist.py

Add the following code:

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
import os

# Load and preprocess data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # Normalize
x_train = x_train[..., tf.newaxis]  # Add channel dimension (28x28x1)
x_test = x_test[..., tf.newaxis]

# Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),  # Prevent overfitting
    Dense(10, activation='softmax')  # 10 output classes (digits 0-9)
])

# Compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

# Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"\nTest Accuracy: {test_acc*100:.2f}%")

# Save model
os.makedirs("models", exist_ok=True)
model.save("models/mnist_cnn.h5")
print("Model saved to 'models/mnist_cnn.h5'")
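
If the GPU is shared with other workloads, you may want TensorFlow to allocate memory on demand instead of reserving most of it at startup. This is optional; a minimal sketch that can be placed in train_mnist.py right after the imports, before any data is loaded:

# Optional: allocate GPU memory on demand rather than reserving it all upfront
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)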

2. Run the script.

python3 train_mnist.py

Output.

Test Accuracy: 99.25%
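
The fit() call in the training script also returns a History object, which is stored in history but never used. If you want to see how accuracy evolved across the 10 epochs, a short Matplotlib snippet like the one below could be appended to the end of train_mnist.py (the output filename is just an example):

import matplotlib.pyplot as plt

# Plot training vs. validation accuracy per epoch and save to disk
plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.savefig('training_history.png')
print("Training curves saved to 'training_history.png'")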

Step 3: Test the Model on New Data

Let’s see how well our model performs on unseen digits from the MNIST test set.

1. Create a prediction script.

nano predict.py

Add the following code.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import random

# Load saved model
model = tf.keras.models.load_model('models/mnist_cnn.h5')

# Load test data
(_, _), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = x_test / 255.0
x_test = x_test[..., tf.newaxis]

# Make predictions
def predict_random_sample():
    index = random.randint(0, len(x_test) - 1)  # randint is inclusive, so subtract 1 to stay in range
    image = x_test[index]
    prediction = model.predict(image[np.newaxis, ...])
    predicted_label = np.argmax(prediction)
    
    plt.imshow(image.squeeze(), cmap='gray')
    plt.title(f'Predicted: {predicted_label}, Actual: {y_test[index]}')
    plt.show()

# Predict 5 random samples
for _ in range(5):
    predict_random_sample()

2. Run the prediction.

python3 predict.py

Output.

1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 563ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 23ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 22ms/step
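
Note that plt.show() only opens a window when a display is available, for example over SSH with X forwarding. On a purely headless server, one option is to save each figure to disk instead; a minimal tweak to the end of predict_random_sample() (the filename pattern is just an example):

    # When running headless, replace plt.show() with a file save
    plt.savefig(f'prediction_{index}.png')
    plt.close()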

Step 4: Test on Custom Handwritten Images

Now, let's test the model on your own handwriting.

1. Create an image named hand.png showing a white digit (for example, a 3) on a black background. MNIST digits are white on black, so matching that style gives the best results; if your image has a dark digit on a light background instead, see the note at the end of this step.

2. Create a script that loads and classifies your handwritten image.

nano predict_image.py

Add the following code.

import tensorflow as tf
from PIL import Image
import numpy as np

# Load the trained model
model = tf.keras.models.load_model('models/mnist_cnn.h5')  # Adjust path if needed

def preprocess_custom_image(image_path):
    img = Image.open(image_path).convert('L')  # Grayscale
    img = img.resize((28, 28))  # Resize to MNIST dimensions
    img_array = np.array(img) / 255.0  # Normalize
    img_array = img_array.reshape(1, 28, 28, 1)  # Reshape for model
    return img_array

# Predict
custom_img = preprocess_custom_image("hand.png")  # Replace with your image path
prediction = model.predict(custom_img)
print("Predicted Digit:", np.argmax(prediction))

3. Run the script.

python3 predict_image.py

The script loads hand.png, runs it through the model, and prints the predicted digit.

Output.

Predicted Digit: 3
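
If your image is the opposite of MNIST's style, for example a dark pen stroke scanned on white paper, the model will usually misclassify it because it was trained only on white-on-black digits. One way to handle this is to invert the grayscale image during preprocessing. The variant below is a sketch using Pillow's ImageOps; the function name is just for illustration:

from PIL import Image, ImageOps
import numpy as np

def preprocess_dark_on_light(image_path):
    img = Image.open(image_path).convert('L')  # Grayscale
    img = ImageOps.invert(img)                 # Flip to a white digit on a black background
    img = img.resize((28, 28))                 # Resize to MNIST dimensions
    img_array = np.array(img) / 255.0          # Normalize
    return img_array.reshape(1, 28, 28, 1)     # Add batch and channel dimensions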

Conclusion

You’ve now built a neural network that can recognize handwritten digits with high accuracy using TensorFlow on an Ubuntu GPU server. The GPU acceleration significantly reduces training time, allowing faster experimentation with different architectures and hyperparameters.