TensorFlow MNIST example

    Available in Classic and VPC

    Here you will find the MNIST basic example for TensorFlow beginners and the MNIST advanced example for professionals, both provided by the TensorFlow website.

    We will use the MNIST dataset to create Softmax regression and CNN classification models and evaluate how well they predict digits from image data.

    Each concept or term is described only to the level required to understand the examples. For a more thorough understanding, you should study machine learning and deep learning further.

    MNIST dataset

    The MNIST dataset consists of images, each representing a handwritten digit as a vector, and labels that indicate which digit each image shows. The labels of the images below are 5, 0, 4, and 1 respectively, and a label takes one of 10 values from 0 to 9.

    [Figure: sample MNIST images labeled 5, 0, 4, and 1 (tensorflow-1-3-101_en.png)]

    The MNIST dataset contains 55,000 training examples (mnist.train), 10,000 test examples (mnist.test), and 5,000 validation examples (mnist.validation), each of which is further divided into the images and labels described above.

    Since each image consists of 28x28 (=784) pixels, it is stored as a 784-dimensional vector, and each of the 784 values lies between 0 and 1 depending on the pixel intensity.

    [Figure: a 28x28 image flattened into a 784-dimensional vector (tensorflow-1-3-102_en.png)]
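
    To make the representation concrete, the short NumPy sketch below (illustrative only, not part of the original example; it uses a randomly generated array in place of a real MNIST image) flattens a 28x28 array into the 784-dimensional vector form used by the dataset.

    import numpy as np
    
    image_2d = np.random.rand(28, 28)     # hypothetical grayscale image with pixel values already scaled to [0, 1]
    image_vector = image_2d.reshape(784)  # the flattened 784-dimensional vector stored in the dataset
    print(image_vector.shape)             # (784,)
    print(image_vector.min() >= 0.0)      # True: every value lies between 0 and 1
    print(image_vector.max() <= 1.0)      # True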

    Use the code below to download the data provided by TensorFlow and save it in the data folder.
    Use the “one_hot=True” option (one-hot encoding) to define a label as a 10-dimensional vector rather than as a single numeric value between 0 and 9. One-hot encoded data is described again in the example below.

    """ Import TensorFlow package: You can use tf hereafter. """
    import tensorflow as tf
    import numpy as np
    import matplotlib.pyplot as plt
    
    
    """ Download and load data 
    Download the data provided by TensorFlow, save it in the data folder and load it. 
    The data is downloaded only the first time; after that, the saved data is loaded without downloading, which takes less time.""" 
    from tensorflow.examples.tutorials.mnist import input_data
    %time mnist = input_data.read_data_sets("data/", one_hot=True)  # The %time magic tracks the whole execution time.
    
    Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
    Extracting data/train-images-idx3-ubyte.gz
    Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
    Extracting data/train-labels-idx1-ubyte.gz
    Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
    Extracting data/t10k-images-idx3-ubyte.gz
    Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
    Extracting data/t10k-labels-idx1-ubyte.gz
    CPU times: user 447 ms, sys: 454 ms, total: 901 ms
    Wall time: 36.1 s
    

    As the images are stored as 784-dimensional vectors and the labels were read with the “one_hot=True” option (one-hot encoding), the label “7” is represented as '[ 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]' as shown in the code below. (0 is represented as [ 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.], 1 as [ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.], and 2 as [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.].)

    # Check images/labels data structure 
    print 'train dataset (55,000):', mnist.train.images.shape, mnist.train.labels.shape
    print 'test dataset (10,000):', mnist.test.images.shape, mnist.test.labels.shape
    print 'validation dataset (5,000):', mnist.validation.images.shape, mnist.validation.labels.shape
    
    # Check a data sample (the first image, whose label is 7). 
    print '\nlabel :', mnist.train.labels[0]
    label = np.argmax(mnist.train.labels[0])  # The index of the largest value (where the 1 is)
    
    im = np.reshape(mnist.train.images[0], [28,28])
    plt.imshow(im, cmap='Greys')
    plt.title('label:' + str(label))
    plt.show()
    
    train dataset (55,000): (55000, 784) (55000, 10)
    test dataset (10,000): (10000, 784) (10000, 10)
    validation dataset (5,000): (5000, 784) (5000, 10)
    
    label: [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
    

    Regression model

    This example covers the MNIST basic sample provided by TensorFlow for TensorFlow beginners.
    We will create a regression model, train it to predict labels, and calculate the accuracy of the model.

    Implement the regression

    Define placeholders to hold the images and correct labels, Variables to hold the weights and biases (which store the training results), and then define a Softmax regression model.

    """ Define placeholder: Where the data will be placed. 
    Create a two-dimensional tensor for images and correct labels. 
    None means no limits in length. """
    # Placeholder for image data
    x = tf.placeholder(tf.float32, [None, 784])
    # Placeholder for a correct answer label
    y_ = tf.placeholder(tf.float32, [None, 10])
    
    """ Define Variable:  The weight and bias to store learning results """
    # Initialize it to 0. 
    W = tf.Variable(tf.zeros([784, 10])) # W multiplies a 784-dimensional image vector to produce a 10-dimensional result (one-hot encoded 0 to 9).
    b = tf.Variable(tf.zeros([10]))      # b is a 10-dimensional bias added to the result.
    
    """ Define model: Softmax regression 
    Use Softmax to choose the value with the highest probability among 10 values. """
    y = tf.nn.softmax(tf.matmul(x, W) + b)
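
    For reference, softmax simply exponentiates the 10 scores and normalizes them so they sum to 1; the class with the largest score then gets the largest probability. A minimal NumPy sketch with a made-up score vector (illustrative only, not part of the tutorial code):

    import numpy as np
    
    scores = np.array([2.0, 1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # hypothetical scores (logits) for the 10 classes
    exp_scores = np.exp(scores - np.max(scores))  # subtract the max for numerical stability
    probs = exp_scores / exp_scores.sum()         # probabilities that sum to 1
    print(probs.argmax())                         # 0: the class with the largest score has the largest probability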
    

    Train the model

    Define the loss function and learning rate required for model training, then sample 100 examples at a time and train the model 1000 times.
    Increasing the sample size can improve accuracy, but it also increases training time.
    Stochastic training, which learns from small randomly sampled batches, is commonly used because it produces similar results at a much lower computational cost.

    """ Model training """
    # Define Loss function
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    # Define a learning rate as 0.5.
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
    
    # Initialize all variables before starting a session.
    init = tf.global_variables_initializer()
    
    sess = tf.Session()
    sess.run(init)
    
    # Sample 100 examples at a time and repeat the training step 1000 times.
    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)  # Fetch a batch of 100 randomly sampled examples from the training dataset. 
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})    # Feed the sampled batch_xs, batch_ys into the placeholders x, y_. 
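
    For reference, with a one-hot label the cross-entropy defined above, -sum(y_ * log(y)), reduces to minus the log of the predicted probability of the correct class. A minimal NumPy sketch with made-up numbers (illustrative only, not part of the tutorial code):

    import numpy as np
    
    y_true = np.array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.])  # one-hot label for "7"
    y_pred = np.array([0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.02, 0.01])  # hypothetical prediction
    print(-np.sum(y_true * np.log(y_pred)))  # about 0.105; the loss shrinks as the probability of the correct class approaches 1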
    

    Evaluate the model

    Find the label with the highest probability through tf.argmax.
    Define the correct_prediction and accuracy tensors with tf.equal to check whether the predicted value (y) matches the correct answer (y_).

    To evaluate the model, use the test data to measure the accuracy.
    Below, the accuracy is about 0.915, roughly 91%; the result may vary slightly every time you retrain the model.

    """ Evaluate model """     
    correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))    
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    # Accuracy 
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
    
    0.915
    
    # Check classification results 
    correct_vals = sess.run(correct_prediction, 
                            feed_dict={x: mnist.test.images, y_: mnist.test.labels})
    pred_vals = sess.run(y, feed_dict={x: mnist.test.images} )
    
    print 'Out of total test data', len(correct_vals), 'correct answers:', len(correct_vals[correct_vals == True]), \
          ', wrong answers:', len(correct_vals[correct_vals == False])
        
        
    # Check only 3 correctly classified images 
    fig = plt.figure(figsize=(10,3))
    img_cnt = 0
    for i, cv in enumerate(correct_vals):
        if cv==True:  # In case of correct classification 
            img_cnt +=1
            ax = fig.add_subplot(1,3,img_cnt)
            im = np.reshape(mnist.test.images[i], [28,28])
            label = np.argmax(mnist.test.labels[i])
            pred_label = np.argmax(pred_vals[i])
            ax.imshow(im, cmap='Greys')
            ax.text(2, 2, 'true label=' + str(label) + ', pred label=' + str(pred_label))
            
        if img_cnt == 3:  # Check only 3 images 
            break
    plt.show()      
    
    Out of total test data 10000, correct answers: 9150 , wrong answers: 850
    
    # Check only 3 wrongly classified images 
    fig = plt.figure(figsize=(10,3))
    img_cnt = 0
    for i, cv in enumerate(correct_vals):
        if cv==False:  # In case of wrong classification 
            img_cnt +=1
            ax = fig.add_subplot(1,3,img_cnt)
            im = np.reshape(mnist.test.images[i], [28,28])
            label = np.argmax(mnist.test.labels[i])
            pred_label = np.argmax(pred_vals[i])
            ax.imshow(im, cmap='Greys')
            ax.text(2, 2, 'true label=' + str(label) + ', pred label=' + str(pred_label))
            
        if img_cnt == 3:  # Check only 3 images 
            break      
    plt.show()    
    
    # Close the session when it’s done.
    sess.close()
    

    CNN model

    This example covers the advanced MNIST deep learning sample provided by TensorFlow for TensorFlow professionals.
    Here, you will create a convolutional neural network (CNN) model, a type of deep learning model, train it to predict labels, and calculate the accuracy of the model.

    Initialize weight and bias

    Initialize the weights with a small amount of noise to break the symmetry and keep the gradients from vanishing.
    Since ReLU neurons are used, initialize the bias to a small positive value of 0.1 to prevent dead neurons.

    """ Initialize weight """
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)
    
    """ Initialize bias """ 
    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)
    

    Define convolution and pooling

    The code below sets the stride of the convolution layer to 1 and uses zero padding ('SAME') so that the output size equals the input size.
    For pooling, apply 2x2 max pooling with a stride of 2.

    """ Define convolution """
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    
    """ Define pooling """
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    

    Define convolutional layer

    The code below shows how a 28x28 image is transformed into a 7x7 image as it passes through two convolutional layers.

    A 5x5 filter with stride 1 would by itself shrink a 28x28 image to 24x24, but because the convolution pads the image on all sides ('SAME' padding), the output of the first convolutional layer remains 28x28. The first 2x2 max pooling with stride 2 then reduces it to 14x14.

    In the same way, the second convolutional layer (5x5 filter, stride 1, 'SAME' padding) keeps the image at 14x14, and the second max pooling with stride 2 reduces it to 7x7.
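
    To double-check this arithmetic, here is a minimal sketch (illustrative only, not part of the tutorial code) that computes the output sizes; with 'SAME' padding, the output size of a layer is ceil(input size / stride).

    import math
    
    def same_out(size, stride):
        # output size of a convolution or pooling layer with 'SAME' padding
        return int(math.ceil(float(size) / stride))
    
    size = same_out(28, 1)    # first convolution, 5x5 filter, stride 1  -> 28
    size = same_out(size, 2)  # first 2x2 max pooling, stride 2          -> 14
    size = same_out(size, 1)  # second convolution, 5x5 filter, stride 1 -> 14
    size = same_out(size, 2)  # second 2x2 max pooling, stride 2         -> 7
    print(size)               # 7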

    # Redefine input data with 4D tensor. 
    # The second/third parameter specifies the width/height of the image.
    # Since it is a monochrome image, the number of color channels of the last parameter is 1.
    x_image = tf.reshape(x, [-1,28,28,1])
    
    """ Define first convolutional layer """
    # Weight tensor definition (patch size, patch size, input channel, output channel) 
    # Use 32 features (kernel, filter) with a 5x5 window (also called patch) size.
    # Since the image is monochrome, the input channel is 1.
    W_conv1 = weight_variable([5, 5, 1, 32])  
    # Define bias tensor 
    b_conv1 = bias_variable([32])
    # Apply convolution to the x_image and the weight tensor, add the bias, and then apply ReLU function.
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    # Apply max pooling to get an output value.
    h_pool1 = max_pool_2x2(h_conv1)
    
    """ Define second convolutional layer """
    # Define weight tensor (patch size, patch size, input channel, output channel)
    # Use 64 features with a 5x5 window (also called patch) size.
    # The size of the output channel of the previous layer is 32, which is the input channel here. 
    W_conv2 = weight_variable([5, 5, 32, 64]) 
    # Define bias tensor 
    b_conv2 = bias_variable([64])
    # Apply convolution to the h_pool1 output value of the first convolutional layer and the weight tensor, add the bias, and then apply ReLU function.
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    # Apply max pooling to get an output value.
    h_pool2 = max_pool_2x2(h_conv2)
    
    """ Define fully-connected layer """
    # The pooled output is 7x7 with 64 channels; 1024 is the arbitrarily chosen number of neurons.
    W_fc1 = weight_variable([7 * 7 * 64, 1024])  
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    
    """ Define dropout """
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    
    """ Define final softmax hierarchy """
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
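
    As a quick sanity check on the layer sizes described earlier, you can print the static shapes of the tensors defined above (this assumes the graph built in this section):

    print(x_image.get_shape())  # (?, 28, 28, 1)
    print(h_pool1.get_shape())  # (?, 14, 14, 32)
    print(h_pool2.get_shape())  # (?, 7, 7, 64)
    print(h_fc1.get_shape())    # (?, 1024)
    print(y_conv.get_shape())   # (?, 10)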
    

    Train and evaluate the model

    Use the code below to train the model and evaluate its accuracy; you can see that it is more accurate than the regression model above.

    Connect to [Public IP address:18888] in your web browser to see TensorBoard.
    If the connection fails, connect to the server using a terminal and execute “jup tb-start” to start the TensorBoard process (refer to “Manage TensorBoard process”).

    # Model training and evaluation 
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        
        # Sample 100 examples at a time and repeat the training step 2000 times.
        for i in range(2000):
            batch = mnist.train.next_batch(100)
            if i % 100 == 0:
                train_accuracy = accuracy.eval(feed_dict={
                    x: batch[0], y_: batch[1], keep_prob: 1.0})
                print 'step %d, training accuracy %g' % (i, train_accuracy)
            train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    
        print 'test accuracy %g' % accuracy.eval(feed_dict={
            x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
    
    step 0, training accuracy 0.17
    step 100, training accuracy 0.81
    step 200, training accuracy 0.93
    step 300, training accuracy 0.89
    step 400, training accuracy 0.92
    step 500, training accuracy 0.95
    step 600, training accuracy 0.99
    step 700, training accuracy 0.98
    step 800, training accuracy 0.93
    step 900, training accuracy 0.94
    step 1000, training accuracy 0.97
    step 1100, training accuracy 0.95
    step 1200, training accuracy 0.97
    step 1300, training accuracy 0.99
    step 1400, training accuracy 0.95
    step 1500, training accuracy 0.98
    step 1600, training accuracy 0.97
    step 1700, training accuracy 0.96
    step 1800, training accuracy 0.98
    step 1900, training accuracy 0.97
    test accuracy 0.9795
    
