Transfer Learning with TensorFlow
Notebook Goals
- Describe Transfer Learning & Feature-Extraction transfer learning a bit
- Build & analyze the loss & accuracy of a Transfer-Learning Feature-Extraction Model
- Use the TensorBoard callback to log & visualize model performance
- Have a few resources to leverage while transfer learning: transfer-learning links & tensorboard reference
Transfer Learning
Transfer learning means taking a model that has already been trained on one (usually large) dataset and reusing it on a related problem.
Transfer learning allows for using "battle-tested" models without re-inventing the wheel.
Feature Extraction
A pretrained model is reused mostly as-is on custom data: the pretrained layers and their "underlying patterns" (weights) stay frozen, and only a new output layer gets trained on the new dataset.
The final/last/top layers are the layers that get replaced in feature-extraction transfer learning: if the original model was trained on 100 classes but we only need 12 for our use case, feature extraction lets us keep the majority of the model (layers & weights) frozen and swap in an output layer that matches our needs.
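As a minimal sketch of the idea (the tiny layer sizes and the 12-class head are made up for illustration, not taken from a real pretrained network): freeze the base, then train only a new output head.

```python
import tensorflow as tf

# Toy "pretrained" base: in practice this would come from TF Hub or
# tf.keras.applications, already trained on a large dataset.
base = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu", name="pretrained_layer"),
])
base.trainable = False  # freeze the "underlying patterns" (weights)

# Attach a fresh output layer sized for *our* 12 classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(12, activation="softmax", name="new_output_layer"),
])
_ = model(tf.zeros((1, 4)))  # run once so the model gets built

# Only the new head's kernel + bias get updated during training.
print(len(model.trainable_variables))  # → 2
```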
Fine-Tuning
Fine-tuning transfer learning takes the "underlying patterns" (weights) of the pretrained model and adjusts (some or all of) them on the new data. Fine-tuning overwrites more of the original model than feature extraction does.
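A hedged sketch of the difference (again with a toy stand-in base, not a real backbone): unfreeze just the top of the base so its weights get adjusted too, typically with a much lower learning rate than the original training used.

```python
import tensorflow as tf

# Toy stand-in for a pretrained backbone with a few layers.
base = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu", name="block_1"),
    tf.keras.layers.Dense(8, activation="relu", name="block_2"),
    tf.keras.layers.Dense(8, activation="relu", name="block_3"),
])

# Fine-tuning: unfreeze only the top layer(s); lower layers stay frozen.
base.trainable = True
for layer in base.layers[:-1]:
    layer.trainable = False  # keep the lower layers' patterns fixed

print([layer.name for layer in base.layers if layer.trainable])
```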
Transfer learning "steps"
- check out pre-trained models on Kaggle
- filter the models by the problem domain in scope: images, etc.
- consider / select a TensorFlow model that fits the use-case
Transfer Learning Resources
There are a bunch of pre-built models available online, ready to use with TensorFlow. Two examples, plus the docs:
- ResNetV2
- EfficientNet (also a google blog post about it here)
- tensorflow docs on transfer learning
Imports
#
# NOTE: tensorflow version 2.15 required:
#
# dockerfile
#
# FROM quay.io/jupyter/tensorflow-notebook
# RUN pip uninstall -y tensorflow tf-keras
# RUN pip install tensorflow==2.15 tensorflow_hub
#
import datetime
import zipfile
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers
import matplotlib.pyplot as plt
print(f"Notebook last run (end-to-end): {datetime.datetime.now()}")
# Use Keras 2.
version_fn = getattr(tf.keras, "version", None)
if version_fn and version_fn().startswith("3."):
    print('importing tf_keras')
    import tf_keras as keras
else:
    print('NOT importing tf_keras')
    keras = tf.keras
tf.config.list_physical_devices()
hub.__version__
Get Data
Based on the Food101 dataset; here we grab the 10% subset of the data.
# Get data (10% of labels)
import zipfile
# Download data
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
# Unzip the downloaded file
zip_ref = zipfile.ZipFile("10_food_classes_10_percent.zip", "r")
zip_ref.extractall()
zip_ref.close()
Inspect Downloaded data
for dirpath, dirnames, filenames in os.walk("10_food_classes_10_percent"):
    print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")
Create Image-Loaders
#
# Variables
#
IMG_OUTPUT_SHAPE = (224, 224)
BATCH_SIZE = 32
MODEL_CLASS_MODE = 'categorical'
data_dir_path = "10_food_classes_10_percent/"
train_dir_path = data_dir_path + "train/"
test_dir_path = data_dir_path + "test/"
train_datagenerator = ImageDataGenerator(rescale=1/255.)
test_datagenerator = ImageDataGenerator(rescale=1/255.)
print("Training images:")
train_data_10_percent = train_datagenerator.flow_from_directory(train_dir_path,
target_size=IMG_OUTPUT_SHAPE,
batch_size=BATCH_SIZE,
class_mode=MODEL_CLASS_MODE)
print("Testing images:")
test_data = test_datagenerator.flow_from_directory(test_dir_path,
target_size=IMG_OUTPUT_SHAPE,
batch_size=BATCH_SIZE,
class_mode=MODEL_CLASS_MODE)
print(f'How many classes in training data: {train_data_10_percent.num_classes}')
Setup Model Callbacks
TensorFlow models use callbacks to interact with training, during and/or after it runs.
Callbacks can be passed to keras methods such as fit(), evaluate(), and predict() in order to hook into the various stages of the model training, evaluation, and inference lifecycle.
Callback examples:
- ModelCheckpoint: "Callback to save the Keras model or model weights at some frequency."
- LearningRateScheduler: "...Learning rate scheduler."
- CSVLogger: stream epoch results to a csv file (very interesting!)
- TensorBoard: enable visualisation for TensorBoard
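For instance (the file paths here are illustrative placeholders, not part of this notebook), a few of the callbacks above can be constructed and handed to fit() together:

```python
import tensorflow as tf

# Each callback hooks into a different part of the training lifecycle.
callbacks = [
    tf.keras.callbacks.ModelCheckpoint("checkpoints/best.keras",  # save the best model seen so far
                                       save_best_only=True),
    tf.keras.callbacks.CSVLogger("training_log.csv"),             # stream per-epoch metrics to a CSV
    tf.keras.callbacks.TensorBoard(log_dir="logs"),               # write TensorBoard event files
]

# All of them would go to fit() in one list, e.g.:
# model.fit(train_data, epochs=5, callbacks=callbacks)
print(len(callbacks))  # → 3
```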
Here, a callback is set up for TensorBoard logging:
def create_tensorboard_callback(dir_name, experiment_name):
    log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(
        log_dir=log_dir
    )
    print(f"Saving TensorBoard log files to: {log_dir}")
    return tensorboard_callback
Use A Pre-built Model
Here, we'll play around with two models: ResNet and EfficientNet.
# Resnet 50 V2 feature vector
resnet_url = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4"
# Original: EfficientNetB0 feature vector (version 1)
efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"
# # UPDATED EfficientNetB0 feature vector (version 2)
# efficientnet_url = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_b0/feature_vector/2"
A model-creator function
Takes a URL & the number of classes/labels, and returns a model built on top of the model from that URL.
def makeModel(model_url, num_classes=10):
    """Takes a TensorFlow Hub URL and creates a Keras Sequential model with it.

    Args:
      model_url (str): A TensorFlow Hub feature extraction URL.
      num_classes (int): Number of output neurons in output layer,
        should be equal to number of target classes, default 10.

    Returns:
      An uncompiled Keras Sequential model with model_url as feature
      extractor layer and Dense output layer with num_classes outputs.
    """
    # Download the pretrained model and save it as a Keras layer
    feature_extractor_layer = hub.KerasLayer(model_url,
                                             trainable=False, # freeze the underlying patterns
                                             name='feature_extraction_layer',
                                             input_shape=IMG_OUTPUT_SHAPE+(3,)) # define the input image shape

    # Create our own model
    model = tf.keras.Sequential([
        feature_extractor_layer, # use the feature extraction layer as the base
        layers.Dense(num_classes, activation='softmax', name='output_layer') # create our own output layer
    ])

    return model
Create The ResNet Transfer-Learning Ready Model
#
# Create model
#
resnet_model = makeModel(resnet_url, num_classes=train_data_10_percent.num_classes)
#
# Compile
#
resnet_model.compile(loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
#
# Fit
#
resnet_history = resnet_model.fit(train_data_10_percent,
epochs=5,
steps_per_epoch=len(train_data_10_percent),
validation_data=test_data,
validation_steps=len(test_data),
# Add TensorBoard callback to model (callbacks parameter takes a list)
callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub", # save experiment logs here
experiment_name="resnet50V2")]) # name of log files
Inspect Model
Summary
resnet_model.summary()
Viz Loss & Accuracy Curves
def plot_loss_curves(history):
    """
    Returns separate loss curves for training and validation metrics.
    """
    chartW = 12
    chartH = 3
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    accuracy = history.history['accuracy']
    val_accuracy = history.history['val_accuracy']
    epochs = range(len(history.history['loss']))

    plt.figure(figsize=(chartW, chartH))

    # Plot loss
    plt.subplot(1, 2, 1)
    plt.plot(epochs, loss, label='training_loss')
    plt.plot(epochs, val_loss, label='val_loss')
    plt.title('Loss')
    plt.xlabel('Epochs')
    plt.legend()

    # Plot accuracy
    plt.subplot(1, 2, 2)
    plt.plot(epochs, accuracy, label='training_accuracy')
    plt.plot(epochs, val_accuracy, label='val_accuracy')
    plt.title('Accuracy')
    plt.xlabel('Epochs')
    plt.legend();
plot_loss_curves(resnet_history)
- the loss curves MIGHT indicate that the model is learning the training data better than the testing data: could be overfitting
Use Pre-Built Model II: EfficientNet
# Create model
efficientnet_model = makeModel(model_url=efficientnet_url, # use EfficientNetB0 TensorFlow Hub URL
num_classes=train_data_10_percent.num_classes)
# Compile EfficientNet model
efficientnet_model.compile(loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
# Fit EfficientNet model
efficientnet_history = efficientnet_model.fit(train_data_10_percent, # only use 10% of training data
epochs=5, # train for 5 epochs
steps_per_epoch=len(train_data_10_percent),
validation_data=test_data,
validation_steps=len(test_data),
callbacks=[create_tensorboard_callback(dir_name="tensorflow_hub",
# Track logs under different experiment name
experiment_name="efficientnetB0")])
Inspect the Model
Summary
efficientnet_model.summary()
Viz Loss & Accuracy Curves
plot_loss_curves(efficientnet_history)
View analysis in TensorBoard
Check out this notebook for ideas on getting TensorBoard connected: here
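One way to open the logs written by the callback above, assuming TensorBoard is installed locally (the `%tensorboard` notebook magic is an alternative to the plain CLI):

```shell
# From a terminal, point TensorBoard at the log directory used above:
tensorboard --logdir tensorflow_hub

# Or inside a notebook cell:
#   %load_ext tensorboard
#   %tensorboard --logdir tensorflow_hub
```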