Image Classification

Classify images by whether they show pizza or steak.
The data comes from the Food-101 dataset, specifically a subset that contains only pizza and steak images.

Notebook Goals

  • use a pre-built set of images from the web
  • build & experiment with machine-learning models
  • compare dense-only (fully connected) binary classifiers against a CNN
  • address overfitting by using
    • MaxPool layers to reduce the number of "features" that the model is dealing with
    • data augmentation to train the model with images that are "imperfect" to mimic real-world imperfect photos
    • shuffling training data: reduces the chance that the model "learns" patterns from the order of the input data

References

Pre-Built Image-recognition Models

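Check out some pre-built resources for image recognition:

  • ImageNet: a very large, widely used labeled image dataset (not a model itself); many pre-trained image-recognition models are trained on it.
  • ResNet50: a 50-layer residual network, commonly available pre-trained on ImageNet.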
In [1]:
import zipfile
import os
import pathlib
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import random
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPool2D, Activation
from tensorflow.keras import Sequential
import pandas as pd

Download & Inspect Data

In [2]:
# Download zip file of pizza_steak images
fileName = 'pizza_steak.zip'
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip

# Unzip the downloaded file (the context manager closes the archive automatically)
with zipfile.ZipFile(fileName, "r") as zip_ref:
  zip_ref.extractall()
--2024-06-18 17:25:17--  https://storage.googleapis.com/ztm_tf_course/food_vision/pizza_steak.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.40.155, 142.251.40.187, 142.250.65.187, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.40.155|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 109540975 (104M) [application/zip]
Saving to: ‘pizza_steak.zip.2’

pizza_steak.zip.2   100%[===================>] 104.47M  11.1MB/s    in 9.5s    

2024-06-18 17:25:27 (11.0 MB/s) - ‘pizza_steak.zip.2’ saved [109540975/109540975]

Inspect The Data

The data is in a dir, pizza_steak.
The dir has 2 subdirs, test and train.
Each subdir has 2 subdirs, pizza and steak.
Each pizza and steak dir has images included.

In [3]:
!ls pizza_steak/train/steak
1000205.jpg  1647351.jpg  2238681.jpg  2824680.jpg  3375959.jpg  417368.jpg
100135.jpg   1650002.jpg  2238802.jpg  2825100.jpg  3381560.jpg  4176.jpg
101312.jpg   165639.jpg   2254705.jpg  2826987.jpg  3382936.jpg  42125.jpg
1021458.jpg  1658186.jpg  225990.jpg   2832499.jpg  3386119.jpg  421476.jpg
1032846.jpg  1658443.jpg  2260231.jpg  2832960.jpg  3388717.jpg  421561.jpg
10380.jpg    165964.jpg   2268692.jpg  285045.jpg   3389138.jpg  438871.jpg
1049459.jpg  167069.jpg   2271133.jpg  285147.jpg   3393547.jpg  43924.jpg
1053665.jpg  1675632.jpg  227576.jpg   2855315.jpg  3393688.jpg  440188.jpg
1068516.jpg  1678108.jpg  2283057.jpg  2856066.jpg  3396589.jpg  442757.jpg
1068975.jpg  168006.jpg   2286639.jpg  2859933.jpg  339891.jpg	 443210.jpg
1081258.jpg  1682496.jpg  2287136.jpg  286219.jpg   3417789.jpg  444064.jpg
1090122.jpg  1684438.jpg  2291292.jpg  2862562.jpg  3425047.jpg  444709.jpg
1093966.jpg  168775.jpg   229323.jpg   2865730.jpg  3434983.jpg  447557.jpg
1098844.jpg  1697339.jpg  2300534.jpg  2878151.jpg  3435358.jpg  461187.jpg
1100074.jpg  1710569.jpg  2300845.jpg  2880035.jpg  3438319.jpg  461689.jpg
1105280.jpg  1714605.jpg  231296.jpg   2881783.jpg  3444407.jpg  465494.jpg
1117936.jpg  1724387.jpg  2315295.jpg  2884233.jpg  345734.jpg	 468384.jpg
1126126.jpg  1724717.jpg  2323132.jpg  2890573.jpg  3460673.jpg  477486.jpg
114601.jpg   172936.jpg   2324994.jpg  2893832.jpg  3465327.jpg  482022.jpg
1147047.jpg  1736543.jpg  2327701.jpg  2893892.jpg  3466159.jpg  482465.jpg
1147883.jpg  1736968.jpg  2331076.jpg  2907177.jpg  3469024.jpg  483788.jpg
1155665.jpg  1746626.jpg  233964.jpg   290850.jpg   3470083.jpg  493029.jpg
1163977.jpg  1752330.jpg  2344227.jpg  2909031.jpg  3476564.jpg  503589.jpg
1190233.jpg  1761285.jpg  234626.jpg   2910418.jpg  3478318.jpg  510757.jpg
1208405.jpg  176508.jpg   234704.jpg   2912290.jpg  3488748.jpg  513129.jpg
1209120.jpg  1772039.jpg  2357281.jpg  2916448.jpg  3492328.jpg  513842.jpg
1212161.jpg  1777107.jpg  2361812.jpg  2916967.jpg  3518960.jpg  523535.jpg
1213988.jpg  1787505.jpg  2365287.jpg  2927833.jpg  3522209.jpg  525041.jpg
1219039.jpg  179293.jpg   2374582.jpg  2928643.jpg  3524429.jpg  534560.jpg
1225762.jpg  1816235.jpg  239025.jpg   2929179.jpg  3528458.jpg  534633.jpg
1230968.jpg  1822407.jpg  2390628.jpg  2936477.jpg  3531805.jpg  536535.jpg
1236155.jpg  1823263.jpg  2392910.jpg  2938012.jpg  3536023.jpg  541410.jpg
1241193.jpg  1826066.jpg  2394465.jpg  2938151.jpg  3538682.jpg  543691.jpg
1248337.jpg  1828502.jpg  2395127.jpg  2939678.jpg  3540750.jpg  560503.jpg
1257104.jpg  1828969.jpg  2396291.jpg  2940544.jpg  354329.jpg	 561972.jpg
126345.jpg   1829045.jpg  2400975.jpg  2940621.jpg  3547166.jpg  56240.jpg
1264050.jpg  1829088.jpg  2403776.jpg  2949079.jpg  3553911.jpg  56409.jpg
1264154.jpg  1836332.jpg  2403907.jpg  295491.jpg   3556871.jpg  564530.jpg
1264858.jpg  1839025.jpg  240435.jpg   296268.jpg   355715.jpg	 568972.jpg
127029.jpg   1839481.jpg  2404695.jpg  2964732.jpg  356234.jpg	 576725.jpg
1289900.jpg  183995.jpg   2404884.jpg  2965021.jpg  3571963.jpg  588739.jpg
1290362.jpg  184110.jpg   2407770.jpg  2966859.jpg  3576078.jpg  590142.jpg
1295457.jpg  184226.jpg   2412263.jpg  2977966.jpg  3577618.jpg  60633.jpg
1312841.jpg  1846706.jpg  2425062.jpg  2979061.jpg  3577732.jpg  60655.jpg
1313316.jpg  1849364.jpg  2425389.jpg  2983260.jpg  3578934.jpg  606820.jpg
1324791.jpg  1849463.jpg  2435316.jpg  2984311.jpg  358042.jpg	 612551.jpg
1327567.jpg  1849542.jpg  2437268.jpg  2988960.jpg  358045.jpg	 614975.jpg
1327667.jpg  1853564.jpg  2437843.jpg  2989882.jpg  3591821.jpg  616809.jpg
1333055.jpg  1869467.jpg  2440131.jpg  2995169.jpg  359330.jpg	 628628.jpg
1334054.jpg  1870942.jpg  2443168.jpg  2996324.jpg  3601483.jpg  632427.jpg
1335556.jpg  187303.jpg   2446660.jpg  3000131.jpg  3606642.jpg  636594.jpg
1337814.jpg  187521.jpg   2455944.jpg  3002350.jpg  3609394.jpg  637374.jpg
1340977.jpg  1888450.jpg  2458401.jpg  3007772.jpg  361067.jpg	 640539.jpg
1343209.jpg  1889336.jpg  2487306.jpg  3008192.jpg  3613455.jpg  644777.jpg
134369.jpg   1907039.jpg  248841.jpg   3009617.jpg  3621464.jpg  644867.jpg
1344105.jpg  1925230.jpg  2489716.jpg  3011642.jpg  3621562.jpg  658189.jpg
134598.jpg   1927984.jpg  2490489.jpg  3020591.jpg  3621565.jpg  660900.jpg
1346387.jpg  1930577.jpg  2495884.jpg  3030578.jpg  3623556.jpg  663014.jpg
1348047.jpg  1937872.jpg  2495903.jpg  3047807.jpg  3640915.jpg  664545.jpg
1351372.jpg  1941807.jpg  2499364.jpg  3059843.jpg  3643951.jpg  667075.jpg
1362989.jpg  1942333.jpg  2500292.jpg  3074367.jpg  3653129.jpg  669180.jpg
1367035.jpg  1945132.jpg  2509017.jpg  3082120.jpg  3656752.jpg  669960.jpg
1371177.jpg  1961025.jpg  250978.jpg   3094354.jpg  3663518.jpg  6709.jpg
1375640.jpg  1966300.jpg  2514432.jpg  3095301.jpg  3663800.jpg  674001.jpg
1382427.jpg  1966967.jpg  2526838.jpg  3099645.jpg  3664376.jpg  676189.jpg
1392718.jpg  1969596.jpg  252858.jpg   3100476.jpg  3670607.jpg  681609.jpg
1395906.jpg  1971757.jpg  2532239.jpg  3110387.jpg  3671021.jpg  6926.jpg
1400760.jpg  1976160.jpg  2534567.jpg  3113772.jpg  3671877.jpg  703556.jpg
1403005.jpg  1984271.jpg  2535431.jpg  3116018.jpg  368073.jpg	 703909.jpg
1404770.jpg  1987213.jpg  2535456.jpg  3128952.jpg  368162.jpg	 704316.jpg
140832.jpg   1987639.jpg  2538000.jpg  3130412.jpg  368170.jpg	 714298.jpg
141056.jpg   1995118.jpg  2543081.jpg  3136.jpg     3693649.jpg  720060.jpg
141135.jpg   1995252.jpg  2544643.jpg  313851.jpg   3700079.jpg  726083.jpg
1413972.jpg  199754.jpg   2547797.jpg  3140083.jpg  3704103.jpg  728020.jpg
1421393.jpg  2002400.jpg  2548974.jpg  3140147.jpg  3707493.jpg  732986.jpg
1428947.jpg  2011264.jpg  2549316.jpg  3142045.jpg  3716881.jpg  734445.jpg
1433912.jpg  2012996.jpg  2561199.jpg  3142618.jpg  3724677.jpg  735441.jpg
143490.jpg   2013535.jpg  2563233.jpg  3142674.jpg  3727036.jpg  740090.jpg
1445352.jpg  2017387.jpg  256592.jpg   3143192.jpg  3727491.jpg  745189.jpg
1446401.jpg  2018173.jpg  2568848.jpg  314359.jpg   3736065.jpg  752203.jpg
1453991.jpg  2020613.jpg  2573392.jpg  3157832.jpg  37384.jpg	 75537.jpg
1456841.jpg  2032669.jpg  2592401.jpg  3159818.jpg  3743286.jpg  756655.jpg
146833.jpg   203450.jpg   2599817.jpg  3162376.jpg  3745515.jpg  762210.jpg
1476404.jpg  2034628.jpg  2603058.jpg  3168620.jpg  3750472.jpg  763690.jpg
1485083.jpg  2036920.jpg  2606444.jpg  3171085.jpg  3752362.jpg  767442.jpg
1487113.jpg  2038418.jpg  2614189.jpg  317206.jpg   3766099.jpg  786409.jpg
148916.jpg   2042975.jpg  2614649.jpg  3173444.jpg  3770370.jpg  80215.jpg
149087.jpg   2045647.jpg  2615718.jpg  3180182.jpg  377190.jpg	 802348.jpg
1493169.jpg  2050584.jpg  2619625.jpg  31881.jpg    3777020.jpg  804684.jpg
149682.jpg   2052542.jpg  2622140.jpg  3191589.jpg  3777482.jpg  812163.jpg
1508094.jpg  2056627.jpg  262321.jpg   3204977.jpg  3781152.jpg  813486.jpg
1512226.jpg  2062248.jpg  2625330.jpg  320658.jpg   3787809.jpg  819027.jpg
1512347.jpg  2081995.jpg  2628106.jpg  3209173.jpg  3788729.jpg  822550.jpg
1524526.jpg  2087958.jpg  2629750.jpg  3223400.jpg  3790962.jpg  823766.jpg
1530833.jpg  2088030.jpg  2643906.jpg  3223601.jpg  3792514.jpg  827764.jpg
1539499.jpg  2088195.jpg  2644457.jpg  3241894.jpg  379737.jpg	 830007.jpg
1541672.jpg  2090493.jpg  2648423.jpg  3245533.jpg  3807440.jpg  838344.jpg
1548239.jpg  2090504.jpg  2651300.jpg  3245622.jpg  381162.jpg	 853327.jpg
1550997.jpg  2125877.jpg  2653594.jpg  3247009.jpg  3812039.jpg  854150.jpg
1552530.jpg  2129685.jpg  2661577.jpg  3253588.jpg  3829392.jpg  864997.jpg
15580.jpg    2133717.jpg  2668916.jpg  3260624.jpg  3830872.jpg  885571.jpg
1559052.jpg  2136662.jpg  268444.jpg   326587.jpg   38442.jpg	 907107.jpg
1563266.jpg  213765.jpg   2691461.jpg  32693.jpg    3855584.jpg  908261.jpg
1567554.jpg  2138335.jpg  2706403.jpg  3271253.jpg  3857508.jpg  910672.jpg
1575322.jpg  2140776.jpg  270687.jpg   3274423.jpg  386335.jpg	 911803.jpg
1588879.jpg  214320.jpg   2707522.jpg  3280453.jpg  3867460.jpg  91432.jpg
1594719.jpg  2146963.jpg  2711806.jpg  3298495.jpg  3868959.jpg  914570.jpg
1595869.jpg  215222.jpg   2716993.jpg  330182.jpg   3869679.jpg  922752.jpg
1598345.jpg  2154126.jpg  2724554.jpg  3306627.jpg  388776.jpg	 923772.jpg
1598885.jpg  2154779.jpg  2738227.jpg  3315727.jpg  3890465.jpg  926414.jpg
1600179.jpg  2159975.jpg  2748917.jpg  331860.jpg   3894222.jpg  931356.jpg
1600794.jpg  2163079.jpg  2760475.jpg  332232.jpg   3895825.jpg  937133.jpg
160552.jpg   217250.jpg   2761427.jpg  3322909.jpg  389739.jpg	 945791.jpg
1606596.jpg  2172600.jpg  2765887.jpg  332557.jpg   3916407.jpg  947877.jpg
1615395.jpg  2173084.jpg  2768451.jpg  3326734.jpg  393349.jpg	 952407.jpg
1618011.jpg  217996.jpg   2771149.jpg  3330642.jpg  393494.jpg	 952437.jpg
1619357.jpg  2193684.jpg  2779040.jpg  3333128.jpg  398288.jpg	 955466.jpg
1621763.jpg  220341.jpg   2788312.jpg  3333735.jpg  40094.jpg	 9555.jpg
1623325.jpg  22080.jpg	  2788759.jpg  3334973.jpg  401094.jpg	 961341.jpg
1624450.jpg  2216146.jpg  2796102.jpg  3335013.jpg  401144.jpg	 97656.jpg
1624747.jpg  2222018.jpg  280284.jpg   3335267.jpg  401651.jpg	 979110.jpg
1628861.jpg  2223787.jpg  2807888.jpg  3346787.jpg  405173.jpg	 980247.jpg
1632774.jpg  2230959.jpg  2815172.jpg  3364420.jpg  405794.jpg	 982988.jpg
1636831.jpg  2232310.jpg  2818805.jpg  336637.jpg   40762.jpg	 987732.jpg
1645470.jpg  2233395.jpg  2823872.jpg  3372616.jpg  413325.jpg	 996684.jpg
In [4]:
# 
# SUMMARY OF DATA
# 

parentDir = 'pizza_steak'
# Walk through pizza_steak directory and list number of files
for dirpath, dirnames, filenames in os.walk(parentDir):
  if filenames:
    print(f"    {len(filenames)} images in '{dirpath}'.")
  else:
    print(f"DIR: '{dirpath}' has {len(dirnames)} dirs")
DIR: 'pizza_steak' has 2 dirs
DIR: 'pizza_steak/train' has 2 dirs
    750 images in 'pizza_steak/train/pizza'.
    750 images in 'pizza_steak/train/steak'.
DIR: 'pizza_steak/test' has 2 dirs
    250 images in 'pizza_steak/test/pizza'.
    250 images in 'pizza_steak/test/steak'.
In [5]:
# 
# GET CLASS NAMES
# 
cleanPath = f'{parentDir}/train/'
trainingPath = pathlib.Path(cleanPath)
classNames = sorted([item.name for item in trainingPath.glob('*')])
npClassNames = np.array(classNames) # created a list of class_names from the subdirectories
print(npClassNames)
['pizza' 'steak']

Preview Some Images

In [6]:
def view_random_image(target_dir, target_class):
  # Setup target directory (we'll view images from here)
  target_folder = os.path.join(target_dir, target_class)

  # Pick one random image file name from the folder
  random_image = random.choice(os.listdir(target_folder))

  # Read in the image and plot it using matplotlib
  img = mpimg.imread(os.path.join(target_folder, random_image))
  plt.imshow(img)
  plt.title(target_class)
  plt.axis("off")

  print(f"Image shape: {img.shape}") # show the shape of the image

  return img
In [7]:
img = view_random_image(target_dir=cleanPath,
                        target_class="steak")
Image shape: (512, 512, 3)
output png
In [8]:
img
Out [8]:
array([[[115,  82,  75],
        [118,  85,  78],
        [117,  84,  75],
        ...,
        [ 41,  57,  72],
        [ 42,  59,  75],
        [ 45,  62,  78]],

       [[105,  72,  65],
        [110,  77,  70],
        [113,  80,  71],
        ...,
        [ 42,  58,  73],
        [ 39,  56,  72],
        [ 40,  57,  73]],

       [[106,  73,  66],
        [112,  79,  72],
        [114,  81,  74],
        ...,
        [ 42,  58,  73],
        [ 40,  58,  72],
        [ 40,  58,  72]],

       ...,

       [[ 95, 105, 107],
        [ 91, 101, 103],
        [ 89,  99, 101],
        ...,
        [ 44,  40,  37],
        [ 43,  39,  36],
        [ 45,  41,  38]],

       [[ 95, 105, 107],
        [ 88,  98, 100],
        [ 85,  95,  97],
        ...,
        [ 42,  38,  35],
        [ 44,  40,  37],
        [ 49,  45,  42]],

       [[ 90, 100, 102],
        [ 83,  93,  95],
        [ 79,  89,  91],
        ...,
        [ 41,  37,  34],
        [ 42,  38,  35],
        [ 48,  44,  41]]], dtype=uint8)
In [9]:
print(f'img shape: {img.shape}')
img shape: (512, 512, 3)

Key Points

  • the data is a set of JPEG images
  • the images are split into directories: test & train, then by class (2 classes: pizza and steak)
  • the sampled image has a shape of 512x512 with 3 color channels per pixel (RGB); other images in the dataset may have different dimensions
    • it has become common to resize images to 224x224 before feeding them to a model
    • the RGB values
      • are integers between 0 and 255 per channel, where (0, 0, 0) is black and (255, 255, 255) is white
      • 1st value is red
      • 2nd value is green
      • 3rd value is blue

Build A Model: CNN

About Convolutional Neural Networks

Parts of a CNN:

  • input (images)
  • LAYERS & related details
    • input layer: batch_size, img dimensions, color channels
    • convolution layer: learns "the most important" patterns in the images, Conv2D
    • hidden activation: adds "non-linearity" to learned features, most typically relu
    • pooling layer: reduces the dimensions of learned features, AvgPool2D, MaxPool2D
    • "fully connected" layer: a "last step" that aggregates / refines the features from the convolution layers, Dense
    • output layer: has one output per "class" to learn (or a single unit for binary classification)
    • output activation: converts the output layer's values into probabilities, sigmoid (binary) or softmax (multi-class)

A typical CNN structure: Input -> Conv + ReLU layers (non-linearities) -> Pooling layer -> Fully connected (dense) layer -> Output

Prep the model Data

  • get data into "training" and "testing" datasets
  • "batch" the data: sub-sets of data to minimize the data loaded into memory (GPU or CPU) at once
  • scale the image data-values to be between 0-to-1
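
As a rough illustration of the scaling step, here is a minimal plain-Python sketch that applies the same 1/255 factor to the first three pixels of the sample image printed earlier. The helper name scale_pixel is hypothetical and is not part of the notebook; the notebook itself leaves this work to ImageDataGenerator(rescale=...).

```python
# Hypothetical helper (not from the notebook): scale one RGB pixel
# from the 0-255 integer range to the 0-1 float range.
MAX_PIXEL_VALUE = 255

def scale_pixel(pixel):
    return [channel / MAX_PIXEL_VALUE for channel in pixel]

# First three pixels of the sample steak image shown above
pixels = [[115, 82, 75], [118, 85, 78], [117, 84, 75]]
scaled = [scale_pixel(p) for p in pixels]

for original, result in zip(pixels, scaled):
    print(original, "->", [round(value, 3) for value in result])
```

Every scaled value lands between 0 and 1, which keeps the inputs in a small, consistent range for training.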
In [10]:
# Set the seed
tf.random.set_seed(42)
imgW = 224
imgH = 224
maxScaleNumber = 255

# 
# batch_size: limits amount of data in memory at once (in batches!)
# 32 has become a regular starting place in the machine-learning world
# 
imagesInABatch = 32

# Preprocess data (scale all pixel values to between 0 and 1, also called scaling/normalization)
# rescaling normalizes 0-255 values to 0-1
# ImageDataGenerator DOCS (a lot there)
# https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./maxScaleNumber)
valid_datagen = ImageDataGenerator(rescale=1./maxScaleNumber)

# Setup the train and test directories
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"

# 
# BATCH the data
# 
train_data = train_datagen.flow_from_directory(train_dir,
                                               batch_size=imagesInABatch, # number of images to process at a time 
                                               target_size=(imgH, imgW), # resize all images to 224 x 224 (height, width)
                                               class_mode="binary", # type of problem we're working on
                                               seed=42)

# NOTE: validation data is read from the test directory (there is no separate validation split)
valid_data = valid_datagen.flow_from_directory(test_dir,
                                               batch_size=imagesInABatch,
                                               target_size=(imgH, imgW),
                                               class_mode="binary",
                                               seed=42)
# for later modeling (same directory and settings as valid_data)
test_data = valid_datagen.flow_from_directory(test_dir,
                                               batch_size=imagesInABatch,
                                               target_size=(imgH, imgW),
                                               class_mode="binary",
                                               seed=42)
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
In [11]:
print(f'how many batches in train_data? {len(train_data)}')
print(f'how many items in the first batch of train_data? {len(train_data[0])}')
how many batches in train_data? 47
how many items in the first batch of train_data? 2
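
The 47 reported above is the number of batches, not the number of images. A minimal sketch of the arithmetic, assuming the 1500 training images and batch size of 32 from the previous cell (the variable names here are illustrative only):

```python
import math

total_images = 1500  # training images found by flow_from_directory
batch_size = 32      # imagesInABatch in the notebook

# The final batch is allowed to be smaller, so round up.
num_batches = math.ceil(total_images / batch_size)
last_batch_size = total_images - (num_batches - 1) * batch_size

print(f"batches per epoch: {num_batches}")
print(f"images in the last batch: {last_batch_size}")
```

Each batch is a pair of (images, labels), which is why the first element of train_data has a length of 2.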

Inspect some training data

In [12]:
# Get a sample of the training data batch 
images, labels = next(train_data) # get the next batch of images/labels (works across Keras versions)
print(f'images:{len(images)}, labels:{len(labels)}')

# NOTICE LABELS:
# 0 or 1
print('labels:')
labels
images:32, labels:32
labels:
Out [12]:
array([0., 0., 1., 0., 0., 0., 0., 1., 1., 1., 0., 1., 0., 1., 0., 1., 1.,
       1., 1., 1., 0., 0., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0.],
      dtype=float32)
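
To read these labels, it helps to know that flow_from_directory assigns class indices in alphanumeric order of the sub-directory names, so here 0 should mean pizza and 1 should mean steak (the generator's class_indices attribute can be used to confirm this). A small sketch, using the first five labels from the batch above and illustrative variable names:

```python
# Sorted sub-directory names, matching the class names printed earlier
class_names = ["pizza", "steak"]

# First five labels from the batch shown above
sample_labels = [0.0, 0.0, 1.0, 0.0, 0.0]

# Labels arrive as floats, so cast to int before indexing
named_labels = [class_names[int(label)] for label in sample_labels]
print(named_labels)
```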

Build the Model

In [13]:
# Create a CNN model (same as Tiny VGG - https://poloclub.github.io/cnn-explainer/)
imageW = 224
imageH = 224
imageColorCount = 3
convoFilterCount = 10
convoKernelSize = 3
maxPoolSize = 2

m1 = tf.keras.models.Sequential([
  Conv2D(filters=convoFilterCount,
         kernel_size=convoKernelSize, # can also be (3, 3)
         activation="relu",
         input_shape=(imageH, imageW, imageColorCount)), # first layer specifies input shape (height, width, colour channels)
  Conv2D(convoFilterCount, convoKernelSize, activation="relu"),
  MaxPool2D(pool_size=maxPoolSize, # pool_size can also be (2, 2)
            padding="valid"), # padding can also be 'same'
  Conv2D(convoFilterCount, convoKernelSize, activation="relu"),
  Conv2D(convoFilterCount, convoKernelSize, activation="relu"), # activation="relu" is equivalent to adding a separate Activation("relu") layer
  MaxPool2D(maxPoolSize),
  Flatten(),
  Dense(1, activation="sigmoid") # single sigmoid unit for binary output
])

# Compile the model
m1.compile(loss="binary_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])

# Fit the model
m1History = m1.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data,
                        validation_steps=len(valid_data))
Epoch 1/5
47/47 [==============================] - 259s 5s/step - loss: 0.6137 - accuracy: 0.6607 - val_loss: 0.4632 - val_accuracy: 0.7880
Epoch 2/5
47/47 [==============================] - 242s 5s/step - loss: 0.4803 - accuracy: 0.7707 - val_loss: 0.4165 - val_accuracy: 0.8180
Epoch 3/5
47/47 [==============================] - 242s 5s/step - loss: 0.4407 - accuracy: 0.8133 - val_loss: 0.3898 - val_accuracy: 0.8500
Epoch 4/5
47/47 [==============================] - 242s 5s/step - loss: 0.3904 - accuracy: 0.8393 - val_loss: 0.3481 - val_accuracy: 0.8580
Epoch 5/5
47/47 [==============================] - 241s 5s/step - loss: 0.3140 - accuracy: 0.8733 - val_loss: 0.3499 - val_accuracy: 0.8540

Inspect The Model Results

In [14]:
# Check out the m1 (CNN) architecture
m1.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 222, 222, 10)      280       
                                                                 
 conv2d_1 (Conv2D)           (None, 220, 220, 10)      910       
                                                                 
 max_pooling2d (MaxPooling2  (None, 110, 110, 10)      0         
 D)                                                              
                                                                 
 conv2d_2 (Conv2D)           (None, 108, 108, 10)      910       
                                                                 
 conv2d_3 (Conv2D)           (None, 106, 106, 10)      910       
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 53, 53, 10)        0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 28090)             0         
                                                                 
 dense (Dense)               (None, 1)                 28091     
                                                                 
=================================================================
Total params: 31101 (121.49 KB)
Trainable params: 31101 (121.49 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
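
The parameter counts in this summary can be reproduced by hand. The sketch below uses the standard formulas for convolution and dense layers (weights plus one bias per filter or unit) and takes the 53 x 53 x 10 shape before Flatten from the summary above; the helper names are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size x kernel_size x in_channels weights plus 1 bias
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(inputs, units):
    # Each unit has one weight per input plus 1 bias
    return (inputs + 1) * units

first_conv = conv2d_params(3, 3, 10)    # RGB input, 10 filters
later_conv = conv2d_params(3, 10, 10)   # 10 input channels, 10 filters
flattened = 53 * 53 * 10                # shape entering Flatten, from the summary
output_dense = dense_params(flattened, 1)

# One first conv layer, three later conv layers, one dense output layer
total = first_conv + 3 * later_conv + output_dense

print(first_conv, later_conv, flattened, output_dense, total)
```

The pooling and flatten layers contribute no parameters, which matches the zeros in the summary.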

Build A Model II: Binary Classification

Dense-only (fully connected) binary classification models are structurally "simpler" than CNNs.
Let's see how a dense-only binary classification model performs on the same data.

Modeling steps

  • Familiarize with the data (visualize, visualize, visualize...)
  • "Preprocess" the data (prepare it for a model)
  • Create a model (start with a baseline)
    • Fit the model
  • Evaluate the model
    • Adjust different parameters and improve model (try to beat your baseline)
  • Repeat Evaluate & Adjust until satisfied
In [15]:
tf.random.set_seed(42)

# Create a model to replicate the TensorFlow Playground model
m2 = tf.keras.Sequential([
  Flatten(input_shape=(imgW, imgH, 3)), # dense layers expect a 1-dimensional vector as input
  Dense(4, activation='relu'),
  Dense(4, activation='relu'),
  Dense(1, activation='sigmoid')
])

# Compile the model
m2.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])

# Fit the model
m2History = m2.fit(train_data, # use same training data created above
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data, # use same validation data created above
                        validation_steps=len(valid_data))
Epoch 1/5
47/47 [==============================] - 20s 369ms/step - loss: 0.9434 - accuracy: 0.4893 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 2/5
47/47 [==============================] - 16s 350ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/5
47/47 [==============================] - 17s 357ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/5
47/47 [==============================] - 17s 351ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6931 - val_accuracy: 0.5000
Epoch 5/5
47/47 [==============================] - 16s 349ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6931 - val_accuracy: 0.5000
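
A loss stuck near 0.693 with 50% accuracy is consistent with the model outputting roughly 0.5 for every image, since the binary cross-entropy of a 0.5 prediction is ln(2) whatever the true label is. A minimal check of that arithmetic (the function below is a plain-Python sketch for a single prediction, not the Keras implementation):

```python
import math

def binary_crossentropy(y_true, y_pred):
    # Loss for a single prediction; y_pred must be strictly between 0 and 1
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

loss_if_pizza = binary_crossentropy(0, 0.5)
loss_if_steak = binary_crossentropy(1, 0.5)

print(round(loss_if_pizza, 4), round(loss_if_steak, 4), round(math.log(2), 4))
```

In other words, the numbers suggest this small dense model is effectively guessing rather than learning from the pixels.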

Inspect the Model

In [16]:
m1.summary()
m2.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 222, 222, 10)      280       
                                                                 
 conv2d_1 (Conv2D)           (None, 220, 220, 10)      910       
                                                                 
 max_pooling2d (MaxPooling2  (None, 110, 110, 10)      0         
 D)                                                              
                                                                 
 conv2d_2 (Conv2D)           (None, 108, 108, 10)      910       
                                                                 
 conv2d_3 (Conv2D)           (None, 106, 106, 10)      910       
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 53, 53, 10)        0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 28090)             0         
                                                                 
 dense (Dense)               (None, 1)                 28091     
                                                                 
=================================================================
Total params: 31101 (121.49 KB)
Trainable params: 31101 (121.49 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_1 (Flatten)         (None, 150528)            0         
                                                                 
 dense_1 (Dense)             (None, 4)                 602116    
                                                                 
 dense_2 (Dense)             (None, 4)                 20        
                                                                 
 dense_3 (Dense)             (None, 1)                 5         
                                                                 
=================================================================
Total params: 602141 (2.30 MB)
Trainable params: 602141 (2.30 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

Build A Model III: Binary Adjusted

In [17]:
# Set random seed
tf.random.set_seed(42)

# Create a model similar to m2 but add an extra layer and increase the number of hidden units in each layer
m3 = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(imgW, imgH, 3)), # dense layers expect a 1-dimensional vector as input
  tf.keras.layers.Dense(100, activation='relu'), # increase number of neurons from 4 to 100 (for each layer)
  tf.keras.layers.Dense(100, activation='relu'),
  tf.keras.layers.Dense(100, activation='relu'), # add an extra layer
  tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
m3.compile(loss='binary_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=["accuracy"])

# Fit the model
m3History = m3.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=valid_data,
                        validation_steps=len(valid_data))
Epoch 1/5
47/47 [==============================] - 59s 1s/step - loss: 4.3189 - accuracy: 0.6033 - val_loss: 1.2869 - val_accuracy: 0.6940
Epoch 2/5
47/47 [==============================] - 56s 1s/step - loss: 0.8009 - accuracy: 0.7120 - val_loss: 0.8136 - val_accuracy: 0.6260
Epoch 3/5
47/47 [==============================] - 56s 1s/step - loss: 1.0083 - accuracy: 0.6787 - val_loss: 0.4765 - val_accuracy: 0.7580
Epoch 4/5
47/47 [==============================] - 55s 1s/step - loss: 0.5737 - accuracy: 0.7447 - val_loss: 2.3730 - val_accuracy: 0.5220
Epoch 5/5
47/47 [==============================] - 55s 1s/step - loss: 0.6828 - accuracy: 0.7333 - val_loss: 1.2816 - val_accuracy: 0.5580

Inspect The Model

In [18]:
m2.summary()
m3.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_1 (Flatten)         (None, 150528)            0         
                                                                 
 dense_1 (Dense)             (None, 4)                 602116    
                                                                 
 dense_2 (Dense)             (None, 4)                 20        
                                                                 
 dense_3 (Dense)             (None, 1)                 5         
                                                                 
=================================================================
Total params: 602141 (2.30 MB)
Trainable params: 602141 (2.30 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_2 (Flatten)         (None, 150528)            0         
                                                                 
 dense_4 (Dense)             (None, 100)               15052900  
                                                                 
 dense_5 (Dense)             (None, 100)               10100     
                                                                 
 dense_6 (Dense)             (None, 100)               10100     
                                                                 
 dense_7 (Dense)             (None, 1)                 101       
                                                                 
=================================================================
Total params: 15073201 (57.50 MB)
Trainable params: 15073201 (57.50 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

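Interestingly

  • the dense-only binary classification models have far more parameters than the CNN: 602,141 (m2) and 15,073,201 (m3) versus 31,101 (m1)
  • despite that, the CNN had the best validation accuracy in the final epoch (about 85%, versus 50% for m2 and about 56% for m3)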
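
Most of those parameters come from the first dense layer, because flattening a 224 x 224 x 3 image produces 150,528 inputs and every unit gets one weight per input. A small sketch of the arithmetic, with a hypothetical helper name, that reproduces the totals from the two summaries above:

```python
def dense_params(inputs, units):
    # One weight per input for each unit, plus one bias per unit
    return (inputs + 1) * units

flattened = 224 * 224 * 3  # inputs after Flatten

# m2: Dense(4) -> Dense(4) -> Dense(1)
m2_total = dense_params(flattened, 4) + dense_params(4, 4) + dense_params(4, 1)

# m3: Dense(100) -> Dense(100) -> Dense(100) -> Dense(1)
m3_total = (dense_params(flattened, 100)
            + dense_params(100, 100)
            + dense_params(100, 100)
            + dense_params(100, 1))

print(flattened, m2_total, m3_total)
```

A convolution layer avoids this blow-up because its small kernel is reused across every position in the image.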

Build A Model: CNN "Baseline"

Setup a "simple" model to start with:

  • 2D in Conv2D refers to the kernel sliding over 2 spatial dimensions, height + width (the color channels are an attribute of each height/width pixel)
  • filters: the number of "feature extractors", or "filters", that get "passed over" the input tensor (common values: 10, 32, 64, 128). The higher the number, the more complex the model.
  • kernel_size describes the shape of the grid of pixels each filter covers (3 means 3x3). The smaller, the more "fine-grained" the feature detection will be
  • strides: describes how far the kernel moves across the image at each step (in pixels)
  • padding: whether to pad the image edges so the output keeps the input size. With padding='valid' nothing is added, so a 224px-wide image with a 3x3 filter and a stride of 1 gives 222 output positions per row (224 - 3 + 1). With padding='same' the edges are zero-padded so the output stays 224 wide
  • features in a CNN are the "significant" parts of images (edges, textures, shapes) that the CNN has learned to detect
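
The effect of kernel_size, strides and padding on the output width can be checked with a little arithmetic. Below is a minimal sketch; conv_output_size is a hypothetical helper written for this note (not a Keras function), and it assumes the usual 'valid' and 'same' size rules.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis of a convolution (hypothetical helper)."""
    if padding == "valid":
        # no padding: the kernel must fit entirely inside the input
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # edges are padded so only the stride shrinks the output
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

valid_width = conv_output_size(224, 3)                   # stride 1, no padding
same_width = conv_output_size(224, 3, padding="same")    # stride 1, padded
stride3_width = conv_output_size(224, 3, stride=3)       # kernel jumps 3px at a time

print(valid_width, same_width, stride3_width)
```

With a stride of 1 and 'valid' padding the width drops from 224 to 222, which is the first output shape to expect in a summary of the model below.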

Build

In [19]:
m4 = Sequential([
  Conv2D(filters=10, 
         kernel_size=3, 
         strides=1,
         padding='valid',
         activation='relu', 
         input_shape=(224, 224, 3)), # input layer (specify input shape)
  Conv2D(10, 3, activation='relu'),
  Conv2D(10, 3, activation='relu'),
  Flatten(),
  Dense(1, activation='sigmoid') # output layer (specify output shape)
])

Compile

In [20]:
# Compile the model
m4.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])
In [21]:
# Fit the model
m4History = m4.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
47/47 [==============================] - 304s 6s/step - loss: 3.1612 - accuracy: 0.5700 - val_loss: 0.5637 - val_accuracy: 0.6740
Epoch 2/5
47/47 [==============================] - 313s 7s/step - loss: 0.5346 - accuracy: 0.7207 - val_loss: 0.5108 - val_accuracy: 0.7260
Epoch 3/5
47/47 [==============================] - 292s 6s/step - loss: 0.4237 - accuracy: 0.8140 - val_loss: 0.4998 - val_accuracy: 0.7740
Epoch 4/5
47/47 [==============================] - 311s 7s/step - loss: 0.2787 - accuracy: 0.8960 - val_loss: 0.5315 - val_accuracy: 0.7560
Epoch 5/5
47/47 [==============================] - 309s 7s/step - loss: 0.1256 - accuracy: 0.9587 - val_loss: 0.6080 - val_accuracy: 0.7600

Inspect & Compare

In [33]:
# m2.summary()
# m3.summary()
m4.summary() 
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_4 (Conv2D)           (None, 222, 222, 10)      280       
                                                                 
 conv2d_5 (Conv2D)           (None, 220, 220, 10)      910       
                                                                 
 conv2d_6 (Conv2D)           (None, 218, 218, 10)      910       
                                                                 
 flatten_3 (Flatten)         (None, 475240)            0         
                                                                 
 dense_8 (Dense)             (None, 1)                 475241    
                                                                 
=================================================================
Total params: 477341 (1.82 MB)
Trainable params: 477341 (1.82 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
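
The Param # column can be reproduced by hand, which also shows why the dense models are so much bigger. This is a sketch using the standard weight-plus-bias counts; conv2d_params and dense_params are hypothetical helper names, not Keras functions.

```python
def conv2d_params(kernel_size, in_channels, filters):
    # each filter has kernel_size * kernel_size * in_channels weights plus 1 bias
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(inputs, units):
    # each unit has one weight per input plus 1 bias
    return (inputs + 1) * units

# m4: three Conv2D layers (10 filters, 3x3) then Flatten + Dense(1)
cnn_total = (
    conv2d_params(3, 3, 10)          # RGB input
    + conv2d_params(3, 10, 10)
    + conv2d_params(3, 10, 10)
    + dense_params(218 * 218 * 10, 1)
)

# dense model summarised earlier: Flatten + 3 x Dense(100) + Dense(1)
dense_total = (
    dense_params(224 * 224 * 3, 100)
    + dense_params(100, 100)
    + dense_params(100, 100)
    + dense_params(100, 1)
)

print(cnn_total, dense_total)
```

Almost all of the dense model's parameters sit in its first layer, because every one of the 150528 input values connects to every one of the 100 units.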

Evaluate

Visualize CNN Model Stats

In [23]:
pd.DataFrame(m4History.history).plot(figsize=(10, 7));
output png
In [24]:
# Plot the validation and training data separately
def plot_loss_curves(history):
  """
  Plots separate loss and accuracy curves for training and validation metrics.
  """ 
  loss = history.history['loss']
  val_loss = history.history['val_loss']

  accuracy = history.history['accuracy']
  val_accuracy = history.history['val_accuracy']

  epochs = range(len(history.history['loss']))

  # Plot loss
  plt.plot(epochs, loss, label='training_loss')
  plt.plot(epochs, val_loss, label='val_loss')
  plt.title('Loss')
  plt.xlabel('Epochs')
  plt.legend()

  # Plot accuracy
  plt.figure()
  plt.plot(epochs, accuracy, label='training_accuracy')
  plt.plot(epochs, val_accuracy, label='val_accuracy')
  plt.title('Accuracy')
  plt.xlabel('Epochs')
  plt.legend();
In [25]:
# Check out the loss curves of model_4
plot_loss_curves(m4History)
output png
output png

Beware Overfitting

Above, the val_loss goes UP after the 3rd epoch while the training loss keeps going down. That gap is the sign of over-fitting. Over-fitting is when the model gets excellent at predicting the data it was trained with, BUT loses the ability to predict NEW input as well.

Overfitting is more likely when...

  • a "large" number of convolutional layers is present
  • a "large" number of convolutional filters is present

Signs of overfitting in the curves:

  • the validation accuracy curve-over-epochs has changed from going up to flat &/or going down, while training accuracy keeps climbing
  • the validation loss curve-over-epochs has "flattened out" or started going up, while training loss keeps going down
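
The same check can be done on the numbers instead of by eye. A minimal sketch, using the m4 losses copied from the training log above (rounded as printed); the variable names are only for illustration.

```python
# m4 losses per epoch, copied from the fit() output above
train_loss = [3.1612, 0.5346, 0.4237, 0.2787, 0.1256]
val_loss = [0.5637, 0.5108, 0.4998, 0.5315, 0.6080]

# epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# epochs where validation loss rose while training loss still fell
overfit_epochs = [
    i + 1
    for i in range(1, len(val_loss))
    if val_loss[i] > val_loss[i - 1] and train_loss[i] < train_loss[i - 1]
]

print(best_epoch, overfit_epochs)
```

Here the best validation loss is at epoch 3, and epochs 4 and 5 both show the diverging pattern.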

Adjust The Model

  • build
  • overfit
  • reduce overfitting (by a few approaches):
    • adjust (reduce) number of convolutional layers
    • adjust number of convolutional filters
    • add dense layer to the output of the flattened layer (note: this adds capacity, so it is more a way to improve the fit than to reduce overfitting)

Here, we'll add a MaxPool2D layer after each convolutional layer.
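
To see what a max-pool layer does, here is a minimal pure-Python sketch of 2x2 max pooling on a tiny 4x4 grid. max_pool_2d is a hypothetical helper for illustration only; the real layer works on batched tensors with channels.

```python
def max_pool_2d(grid, pool_size=2):
    """Keep only the largest value in each pool_size x pool_size block."""
    rows = len(grid) // pool_size      # leftover edge rows/columns are dropped
    cols = len(grid[0]) // pool_size
    return [
        [
            max(
                grid[r * pool_size + i][c * pool_size + j]
                for i in range(pool_size)
                for j in range(pool_size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

grid = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 0, 3, 3],
    [1, 8, 2, 6],
]
pooled = max_pool_2d(grid)
print(pooled)  # [[4, 5], [8, 6]]
```

Height and width are both halved, so 16 values become 4: each pooling layer keeps a quarter of the values in a feature map.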

Build

In [26]:
m5 = Sequential([
  # convo-then-maxPool
  Conv2D(10, 3, activation='relu', input_shape=(224, 224, 3)),
  MaxPool2D(pool_size=2), # halve height and width (a quarter as many values per feature map)
  # convo-then-maxPool
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  # convo-then-maxPool
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Flatten(),
  Dense(1, activation='sigmoid')
])

Compile

In [27]:
m5.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])

Fit

In [28]:
# Fit the model
m5History = m5.fit(train_data,
                        epochs=5,
                        steps_per_epoch=len(train_data),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
47/47 [==============================] - 96s 2s/step - loss: 0.6475 - accuracy: 0.6487 - val_loss: 0.4795 - val_accuracy: 0.7980
Epoch 2/5
47/47 [==============================] - 92s 2s/step - loss: 0.4838 - accuracy: 0.7800 - val_loss: 0.3946 - val_accuracy: 0.8260
Epoch 3/5
47/47 [==============================] - 92s 2s/step - loss: 0.4420 - accuracy: 0.8087 - val_loss: 0.3836 - val_accuracy: 0.8400
Epoch 4/5
47/47 [==============================] - 92s 2s/step - loss: 0.4308 - accuracy: 0.8167 - val_loss: 0.3643 - val_accuracy: 0.8640
Epoch 5/5
47/47 [==============================] - 91s 2s/step - loss: 0.4014 - accuracy: 0.8307 - val_loss: 0.3595 - val_accuracy: 0.8520

Evaluate

Plot The Loss & Accuracy Curves

Watch the loss as epochs increase: it is expected to go down as training goes on. When the validation loss starts to increase while the training loss keeps falling, the model is probably over-fitting. Watch the accuracy as well: it is expected to increase as epochs go on.

In [32]:
plot_loss_curves(m5History)
output png
output png

View Model Stats

In [35]:
m5.summary()
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_7 (Conv2D)           (None, 222, 222, 10)      280       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 111, 111, 10)      0         
 g2D)                                                            
                                                                 
 conv2d_8 (Conv2D)           (None, 109, 109, 10)      910       
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 54, 54, 10)        0         
 g2D)                                                            
                                                                 
 conv2d_9 (Conv2D)           (None, 52, 52, 10)        910       
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 26, 26, 10)        0         
 g2D)                                                            
                                                                 
 flatten_4 (Flatten)         (None, 6760)              0         
                                                                 
 dense_9 (Dense)             (None, 1)                 6761      
                                                                 
=================================================================
Total params: 8861 (34.61 KB)
Trainable params: 8861 (34.61 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
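
The output shapes in this summary can be traced by hand. A short sketch, assuming 3x3 'valid' convolutions with a stride of 1 and 2x2 pooling as in m5; the helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    # a 'valid' convolution with stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def max_pool(size, pool=2):
    # pooling floors the division, so an odd leftover pixel is dropped
    return size // pool

size = 224
trace = [size]
for _ in range(3):          # three conv-then-maxPool pairs
    size = conv_valid(size)
    trace.append(size)
    size = max_pool(size)
    trace.append(size)

flattened = size * size * 10  # 10 filters in the last conv layer

print(trace, flattened)
```

The Flatten layer ends up with 6760 values instead of the 475240 in m4, which is why the final Dense layer (and the whole model) is so much smaller.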

Build A Model: Data Augmentation

Data Augmentation

tf docs on data augmentation!

  • alter training data
  • give training data more "diversity", helping "generalize" patterns for the model to learn
  • e.g. rotating, flipping, cropping, etc
  • help prevent over-fitting: force the model to "learn" from "imperfect" &/or "augmented" images, mimicking real-world "new" images that the model has not-yet seen
  • NOTE: testing will be done on (regular) non-augmented images
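
As an illustration of one augmentation, here is a minimal sketch of a random horizontal flip on a tiny nested-list "image". The function names are hypothetical, and a real pipeline applies many such transforms to full image tensors.

```python
import random

def horizontal_flip(image):
    """Mirror each row of the image left-to-right."""
    return [row[::-1] for row in image]

def maybe_flip(image, rng, probability=0.5):
    """Flip the image with the given probability; the label is never changed."""
    return horizontal_flip(image) if rng.random() < probability else image

image = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(image)
print(flipped)  # [[3, 2, 1], [6, 5, 4]]

rng = random.Random(0)  # seeded only so the sketch is repeatable
augmented = maybe_flip(image, rng)
```

A flipped pizza is still a pizza, so the model sees a "new" image with the same label.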

Prep Data

In [44]:
# Create ImageDataGenerator training instance with data augmentation
train_datagen_augmented = ImageDataGenerator(rescale=1/255.,
                                             rotation_range=20, # randomly rotate each image by up to 20 degrees in either direction
                                             shear_range=0.2, 
                                             zoom_range=0.2, 
                                             width_shift_range=0.2, 
                                             height_shift_range=0.2, 
                                             horizontal_flip=True)

# 
# will re-use "train_datagen" from above
# 
# Create ImageDataGenerator test instance without data augmentation
test_datagen = ImageDataGenerator(rescale=1/255.)
print("Augmented training images")
train_data_augmented = train_datagen_augmented.flow_from_directory(train_dir,
                                                                   target_size=(224, 224),
                                                                   batch_size=32,
                                                                   class_mode='binary',
                                                                   shuffle=False)

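# NOTE: despite the "augmented_" prefix, the variable below holds NON-augmented images (it uses train_datagen)
print("Non-augmented training images:")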
augmented_train_data = train_datagen.flow_from_directory(train_dir,
                                               target_size=(224, 224),
                                               batch_size=32,
                                               class_mode='binary',
                                               shuffle=False) # Don't shuffle for demonstration purposes

# NOTE: despite the "augmented_" prefix, the test images below are NOT augmented (only rescaled)
print("Unchanged test images:")
augmented_test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=(224, 224),
                                             batch_size=32,
                                             class_mode='binary')
Augmented training images
Found 1500 images belonging to 2 classes.
Non-augmented training images:
Found 1500 images belonging to 2 classes.
Unchanged test images:
Found 500 images belonging to 2 classes.

Preview Some "augmented" Images

In [45]:
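# get one batch from each generator to preview
# both generators use shuffle=False, so index i refers to the same source image in each batch
images, labels = next(augmented_train_data) # despite the name, this generator yields NON-augmented images
augmented_images, augmented_labels = next(train_data_augmented) # Note: labels aren't augmented, they stay the same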
In [55]:
random_number = random.randint(0, 31) # we're making batches of size 32, so we'll get a random instance

# 
# Show original image and augmented image
# 
plt.imshow(images[random_number])
plt.title("Original")
plt.axis(False)
plt.figure()
plt.imshow(augmented_images[random_number])
plt.title("Augmented")
plt.axis(False);
output png
output png

Build The Model

In [56]:
# 
# mostly the SAME as m5, BUT using the augmented training data
# 
m6 = Sequential([
  # conv-then-max-pool
  Conv2D(10, 3, activation='relu', input_shape=(224, 224, 3)),
  MaxPool2D(pool_size=2), # halve height and width (a quarter as many values per feature map)

  # conv-then-max-pool
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),

  # conv-then-max-pool
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Flatten(),
  Dense(1, activation='sigmoid')
])

# Compile the model
m6.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])

# Fit the model
# NOTE: TESTING on non-augmented data
m6History = m6.fit(train_data_augmented, # changed to augmented training data
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
47/47 [==============================] - 106s 2s/step - loss: 0.7066 - accuracy: 0.4747 - val_loss: 0.6766 - val_accuracy: 0.5980
Epoch 2/5
47/47 [==============================] - 100s 2s/step - loss: 0.6965 - accuracy: 0.5440 - val_loss: 0.6847 - val_accuracy: 0.5000
Epoch 3/5
47/47 [==============================] - 101s 2s/step - loss: 0.6907 - accuracy: 0.5327 - val_loss: 0.6581 - val_accuracy: 0.5560
Epoch 4/5
47/47 [==============================] - 101s 2s/step - loss: 0.6807 - accuracy: 0.5713 - val_loss: 0.5611 - val_accuracy: 0.6800
Epoch 5/5
47/47 [==============================] - 100s 2s/step - loss: 0.6748 - accuracy: 0.6207 - val_loss: 0.5673 - val_accuracy: 0.7840

Inspect Model

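  • the training accuracy of m6 after 5 epochs, .62..., is lower than m5's .83..., and the validation accuracy is lower too (.78 vs .85)
  • the augmented images are harder to fit, so some drop in training accuracy is expected; on top of that, this generator was built with shuffle=False, so the images arrive in class order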

visualise loss & accuracy

In [57]:
plot_loss_curves(m6History)
output png
output png

Build A Model: Augmented AND shuffled

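Shuffling the training data is one way to keep the model from learning patterns that come only from the order of the input. With shuffle=False, flow_from_directory yields the images in directory order, so most batches contain a single class.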
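
A toy example makes the problem with ordered data visible. This is a sketch with made-up labels (0 and 1 standing in for the two classes) and a hypothetical batches helper; it is not the Keras generator itself.

```python
import random

def batches(items, batch_size):
    """Split a list into consecutive batches."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

# 8 images of class 0 followed by 8 of class 1, as when files are read in directory order
labels = [0] * 8 + [1] * 8

ordered = batches(labels, 4)

shuffled_labels = labels[:]
random.Random(42).shuffle(shuffled_labels)  # seeded only so the sketch is repeatable
shuffled = batches(shuffled_labels, 4)

print("ordered: ", ordered)
print("shuffled:", shuffled)
```

Without shuffling, every batch contains a single class, so each weight update pushes the model toward "everything is class 0" and then "everything is class 1". Shuffling keeps the same labels but mixes them across batches.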

Build & Compile

In [60]:
# 
# This is ALMOST identical to the above augmentation
# BUT shuffle is set to TRUE
# 
train_data_augmented_shuffled = train_datagen_augmented.flow_from_directory(train_dir,
                                                                            target_size=(224, 224),
                                                                            batch_size=32,
                                                                            class_mode='binary',
                                                                            shuffle=True) # Shuffle data (default)
Found 1500 images belonging to 2 classes.
In [59]:
# Create the model (same as m5 and m6)
m7 = Sequential([
  Conv2D(10, 3, activation='relu', input_shape=(224, 224, 3)),
  MaxPool2D(),
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Flatten(),
  Dense(1, activation='sigmoid')
])

# Compile the model
m7.compile(loss='binary_crossentropy',
                optimizer=Adam(),
                metrics=['accuracy'])

# Fit the model
m7History = m7.fit(train_data_augmented_shuffled, # now the augmented data is shuffled
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented_shuffled),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
47/47 [==============================] - 106s 2s/step - loss: 0.6616 - accuracy: 0.5980 - val_loss: 0.5198 - val_accuracy: 0.7880
Epoch 2/5
47/47 [==============================] - 101s 2s/step - loss: 0.5803 - accuracy: 0.6900 - val_loss: 0.4904 - val_accuracy: 0.7360
Epoch 3/5
47/47 [==============================] - 100s 2s/step - loss: 0.5315 - accuracy: 0.7507 - val_loss: 0.4178 - val_accuracy: 0.8280
Epoch 4/5
47/47 [==============================] - 101s 2s/step - loss: 0.5012 - accuracy: 0.7787 - val_loss: 0.4188 - val_accuracy: 0.8220
Epoch 5/5
47/47 [==============================] - 100s 2s/step - loss: 0.4951 - accuracy: 0.7687 - val_loss: 0.3648 - val_accuracy: 0.8520

Inspect Model

  • the training accuracy of m7, .76..., is lower than the "baseline" m4 at .95, but m4 was over-fitting. On the validation data m7 does better: .85 vs .76 for m4. very interesting.

visualise loss & accuracy

In [61]:
plot_loss_curves(m7History)
output png
output png

The shapes of the loss-curve & accuracy curve look better than the previous model.

Build A Model: Tiny VGG Influence

One way to improve on this is to borrow an existing model architecture. Models have already been developed for these types of goals, and their architectures are often published online.
The CNN Explainer Website uses a Tiny VGG architecture (some code based on the architecture here). Below is a model based on that architecture, trained on the augmented and shuffled training data.

Build

In [68]:
# Create a CNN model (same as Tiny VGG but for binary classification - https://poloclub.github.io/cnn-explainer/ )
m8 = Sequential([
  Conv2D(10, 3, activation='relu', input_shape=(224, 224, 3)), # same input shape as our images
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Conv2D(10, 3, activation='relu'),
  Conv2D(10, 3, activation='relu'),
  MaxPool2D(),
  Flatten(),
  Dense(1, activation='sigmoid')
])

# Compile the model
m8.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["accuracy"])

# Fit the model
m8History = m8.fit(train_data_augmented_shuffled,
                        epochs=5,
                        steps_per_epoch=len(train_data_augmented_shuffled),
                        validation_data=test_data,
                        validation_steps=len(test_data))
Epoch 1/5
47/47 [==============================] - 249s 5s/step - loss: 0.6798 - accuracy: 0.5927 - val_loss: 0.5110 - val_accuracy: 0.7600
Epoch 2/5
47/47 [==============================] - 244s 5s/step - loss: 0.5381 - accuracy: 0.7353 - val_loss: 0.4637 - val_accuracy: 0.7540
Epoch 3/5
47/47 [==============================] - 243s 5s/step - loss: 0.5397 - accuracy: 0.7300 - val_loss: 0.3879 - val_accuracy: 0.8540
Epoch 4/5
47/47 [==============================] - 244s 5s/step - loss: 0.5143 - accuracy: 0.7533 - val_loss: 0.3791 - val_accuracy: 0.8480
Epoch 5/5
47/47 [==============================] - 244s 5s/step - loss: 0.4838 - accuracy: 0.7780 - val_loss: 0.3801 - val_accuracy: 0.8660

Inspect

  • the training accuracy of m8, .77..., is lower than the "baseline" m4 at .95, but on the validation data m8 is the best so far: .866 vs .76 for m4.
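
To compare the models fairly it helps to put training and validation accuracy side by side. A small sketch using the final-epoch numbers copied from the training logs above; the dictionary layout is just for illustration.

```python
# final-epoch (train accuracy, validation accuracy) copied from the fit() logs above
results = {
    "m4": (0.9587, 0.7600),
    "m5": (0.8307, 0.8520),
    "m6": (0.6207, 0.7840),
    "m7": (0.7687, 0.8520),
    "m8": (0.7780, 0.8660),
}

# rank by validation accuracy, since that measures performance on unseen images
ranking = sorted(results, key=lambda name: results[name][1], reverse=True)
best = ranking[0]

# a large positive train-minus-validation gap hints at over-fitting
gaps = {name: round(train - val, 4) for name, (train, val) in results.items()}
most_overfit = max(gaps, key=gaps.get)

for name in ranking:
    train, val = results[name]
    print(f"{name}: train={train:.4f} val={val:.4f} gap={gaps[name]:+.4f}")
```

Ranked this way, m4 has the highest training accuracy but the lowest validation accuracy, and it is the only model whose training accuracy is well above its validation accuracy.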

visualise loss & accuracy

In [71]:
plot_loss_curves(m8History)
output png
output png
In [74]:
plot_loss_curves(m4History)
output png
output png
In [73]:
m8History.history['loss']
Out [73]:
[0.6797893643379211,
 0.5381284952163696,
 0.5396506786346436,
 0.5142636895179749,
 0.48379725217819214]

Predicting Images with the best model

Get An Image

In [63]:
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-steak.jpeg 
steak_to_predict = mpimg.imread("03-steak.jpeg")
plt.imshow(steak_to_predict)
plt.axis(False);
--2024-06-19 14:20:16--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-steak.jpeg
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1978213 (1.9M) [image/jpeg]
Saving to: ‘03-steak.jpeg.1’

03-steak.jpeg.1     100%[===================>]   1.89M  7.88MB/s    in 0.2s    

2024-06-19 14:20:16 (7.88 MB/s) - ‘03-steak.jpeg.1’ saved [1978213/1978213]

output png
In [65]:
steak_to_predict.shape
Out [65]:
(4032, 3024, 3)

Prepare Image For the Model

Here, a function to help prep images:

  • read an image from the file-system
  • convert the image to a tensor, including the expected number of color channels (3)
  • resize the image to the size the model was trained on
  • scale the image tensor values to between 0 and 1
In [66]:
# import, translate to tensor, resize & rescale
def load_and_prep_image(filename, img_shape=224):
  # Read in target file (an image)
  img = tf.io.read_file(filename)

  # Decode the read file into a tensor & ensure 3 colour channels 
  # (our model is trained on images with 3 colour channels and sometimes images have 4 colour channels)
  img = tf.image.decode_image(img, channels=3)

  # Resize the image (to the same size our model was trained on)
  img = tf.image.resize(img, size = [img_shape, img_shape])

  # Rescale the image (get all values between 0 and 1)
  img = img/255.
  return img
In [67]:
# Load in and preprocess our custom image
preppedSteakImg = load_and_prep_image("03-steak.jpeg")
preppedSteakImg
Out [67]:
<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
array([[[0.6377451 , 0.6220588 , 0.57892156],
        [0.6504902 , 0.63186276, 0.5897059 ],
        [0.63186276, 0.60833335, 0.5612745 ],
        ...,
        [0.52156866, 0.05098039, 0.09019608],
        [0.49509802, 0.04215686, 0.07058824],
        [0.52843136, 0.07745098, 0.10490196]],

       [[0.6617647 , 0.6460784 , 0.6107843 ],
        [0.6387255 , 0.6230392 , 0.57598037],
        [0.65588236, 0.63235295, 0.5852941 ],
        ...,
        [0.5352941 , 0.06862745, 0.09215686],
        [0.529902  , 0.05931373, 0.09460784],
        [0.5142157 , 0.05539216, 0.08676471]],

       [[0.6519608 , 0.6362745 , 0.5892157 ],
        [0.6392157 , 0.6137255 , 0.56764704],
        [0.65637255, 0.6269608 , 0.5828431 ],
        ...,
        [0.53137255, 0.06470589, 0.08039216],
        [0.527451  , 0.06862745, 0.1       ],
        [0.52254903, 0.05196078, 0.0872549 ]],

       ...,

       [[0.49313724, 0.42745098, 0.31029412],
        [0.05441177, 0.01911765, 0.        ],
        [0.2127451 , 0.16176471, 0.09509804],
        ...,
        [0.6132353 , 0.59362745, 0.57009804],
        [0.65294117, 0.6333333 , 0.6098039 ],
        [0.64166665, 0.62990195, 0.59460783]],

       [[0.65392154, 0.5715686 , 0.45      ],
        [0.6367647 , 0.54656863, 0.425     ],
        [0.04656863, 0.01372549, 0.        ],
        ...,
        [0.6372549 , 0.61764705, 0.59411764],
        [0.63529414, 0.6215686 , 0.5892157 ],
        [0.6401961 , 0.62058824, 0.59705883]],

       [[0.1       , 0.05539216, 0.        ],
        [0.48333332, 0.40882352, 0.29117647],
        [0.65      , 0.5686275 , 0.44019607],
        ...,
        [0.6308824 , 0.6161765 , 0.5808824 ],
        [0.6519608 , 0.63186276, 0.5901961 ],
        [0.6338235 , 0.6259804 , 0.57892156]]], dtype=float32)>
In [90]:
# 
# function to convert a predicted value, between 0-1, to the classification 
# AND plot the image on a visual & show the prediction
# 
def pred_and_plot(model, filename, class_names):
  """
  Imports an image located at filename, makes a prediction on it with
  a trained model and plots the image with the predicted class as the title.
  """
  # Import the target image and preprocess it
  img = load_and_prep_image(filename)

  # Make a prediction
  pred = model.predict(tf.expand_dims(img, axis=0))

  # Get the predicted class
  pred_class = class_names[int(tf.round(pred)[0][0])]

  # Plot the image and predicted class
  plt.imshow(img)
  plt.title(f"Prediction: {pred_class}")
  plt.axis(False);

Predict

In [77]:
m8.predict(preppedSteakImg, axis=0)
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3526, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_212/2808857747.py", line 1, in <module>
    m8.predict(preppedSteakImg, axis=0)
  File "/opt/conda/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/opt/conda/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 67, in error_handler
    filtered_tb = _process_traceback_frames(e.__traceback__)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Model.predict() got an unexpected keyword argument 'axis'

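The call above fails first because Model.predict() does not accept an axis argument (the TypeError in the traceback). Removing that argument is not enough, though: there is also a shape mismatch between the image being predicted and the shape of the batches the model was TRAINED on.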

In [78]:
preppedSteakImg.shape
Out [78]:
TensorShape([224, 224, 3])
In [82]:
train_data_augmented_shuffled[0][0].shape
Out [82]:
(32, 224, 224, 3)

The training batch shape is (32, 224, 224, 3) and the image to predict has shape (224, 224, 3).
The difference is that the first number in the training shape is 32, which is (not a coincidence) the batch size.
To make the image shape match, the expand_dims function can add a batch dimension of 1 at the front:

In [88]:
shapedPredictionImage = tf.expand_dims(preppedSteakImg, axis=0)
shapedPredictionImage.shape
Out [88]:
TensorShape([1, 224, 224, 3])

Predict Again

In [89]:
m8.predict(shapedPredictionImage)
1/1 [==============================] - 0s 87ms/step
Out [89]:
array([[0.66845244]], dtype=float32)
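
The sigmoid output is a single number between 0 and 1, so it still has to be turned into a class name. A minimal sketch of that step; to_class_name is a hypothetical helper, and the class order ["pizza", "steak"] is an assumption based on the alphabetical folder names.

```python
def to_class_name(probability, class_names, threshold=0.5):
    """Map a sigmoid output to a class name: index 1 if above the threshold, else 0."""
    return class_names[int(probability > threshold)]

class_names = ["pizza", "steak"]  # assumed order: index 0 = pizza, index 1 = steak

predicted = to_class_name(0.66845244, class_names)
print(predicted)  # steak
```

A value of 0.668 is closer to 1 than to 0, so the image is classified as steak, though not with much confidence.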
In [92]:
# def pred_and_plot(model, filename, class_names)
pred_and_plot(m8, '03-steak.jpeg', npClassNames)
1/1 [==============================] - 0s 104ms/step
output png
In [94]:
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-pizza-dad.jpeg 
pred_and_plot(m8, "03-pizza-dad.jpeg", npClassNames)
--2024-06-19 17:21:04--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/03-pizza-dad.jpeg
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2874848 (2.7M) [image/jpeg]
Saving to: ‘03-pizza-dad.jpeg.1’

03-pizza-dad.jpeg.1 100%[===================>]   2.74M  9.69MB/s    in 0.3s    

2024-06-19 17:21:04 (9.69 MB/s) - ‘03-pizza-dad.jpeg.1’ saved [2874848/2874848]

1/1 [==============================] - 0s 89ms/step
output png