Table Of Contents
- Transfer Learning: Fine-Tuning
- Notebook Goals
- Functional API for Model-Creation
- A "Sequential" Approach
- A Functional Approach
- Functional & Sequential Syntax Comparison
- DEFINITION: feature vector
- Imports
- Download & Import Helper Functions
- Get Data
- Download the data
- Setup Some Variables
- Split Data Into Train & Test
- Inspect the created dataset vars
- Model: Transfer-Learning I
- Inspect The Model & Results
- Model Layers
- Model Summary
- Visualize Loss & Accuracy Curves
- A Layer-in-action: GlobalAveragePooling2D
- Reduce Mean to get the same
- A Layer-in-action: GlobalMaxPool2D
Transfer Learning: Fine-Tuning
Fine-tuning is a type of transfer learning.
Transfer learning (briefly):
- uses existing pre-trained models
- adapts the model to a new use case
Fine-tuning unfreezes and retrains some layers of the pre-trained model. It updates more layers than a "feature extraction" transfer learning setup, which keeps the pre-trained layers frozen and trains only the newly added layers.
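To make the difference concrete, here is a minimal sketch. It assumes a tiny stand-in "base" network: the layer sizes and names such as `toy_base` are hypothetical and are not the model used later in this notebook. Freezing the whole base corresponds to feature extraction; unfreezing its last layer corresponds to fine-tuning.

```python
import tensorflow as tf

# Hypothetical stand-in for a pre-trained base model (no real weights are loaded).
toy_base = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu", name="base_dense_1"),
    tf.keras.layers.Dense(16, activation="relu", name="base_dense_2"),
    tf.keras.layers.Dense(16, activation="relu", name="base_dense_3"),
], name="toy_base")

# Feature extraction: freeze every layer of the base.
toy_base.trainable = False
feature_extraction_trainable = len(toy_base.trainable_weights)

# Fine-tuning: unfreeze the base, then re-freeze all but the last layer.
toy_base.trainable = True
for layer in toy_base.layers[:-1]:
    layer.trainable = False
fine_tuning_trainable = len(toy_base.trainable_weights)

print(feature_extraction_trainable)  # 0 trainable weight tensors
print(fine_tuning_trainable)         # 2 (kernel and bias of the last Dense layer)
```

In both setups the newly added output layers are trained as well; the sketch only counts what changes inside the base.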
Notebook Goals
- review differences between the Sequential and Functional APIs for tf.keras model development
- import a series of "helper" functions from an external source
- import & split data into training & testing datasets using tf.keras.preprocessing.image_dataset_from_directory
- build a model using the Keras Functional API
- save model logs to a CSV file using a CSV callback function
Functional API for Model-Creation
A "Sequential" Approach
# CREATE
sequential_model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
], name="sequential_model")
# COMPILE
sequential_model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                         optimizer=tf.keras.optimizers.Adam(),
                         metrics=["accuracy"])
# FIT to data (x_training_data and y_training_data are placeholders for your own arrays)
sequential_model.fit(x_training_data, y_training_data, batch_size=32, epochs=5)
A Functional Approach
# CREATE
inputLayer = tf.keras.layers.Input(shape=(28, 28))
# Each layer is instantiated first, then called on the previous layer's output
appliedModel = tf.keras.layers.Flatten()(inputLayer)
appliedModel = tf.keras.layers.Dense(64, activation="relu")(appliedModel)
appliedModel = tf.keras.layers.Dense(64, activation="relu")(appliedModel)
appliedOutput = tf.keras.layers.Dense(10, activation="softmax")(appliedModel)
functional_model = tf.keras.Model(inputLayer, appliedOutput, name="functional_model")
# COMPILE
functional_model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                         optimizer=tf.keras.optimizers.Adam(),
                         metrics=["accuracy"])
# FIT to data (x_training_data and y_training_data are placeholders for your own arrays)
functional_model.fit(x_training_data, y_training_data, batch_size=32, epochs=5)
Functional & Sequential Syntax Comparison
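- compile and fit are the same in both styles
- the Functional API is more flexible: each layer is called on the output of another layer, so a model can branch, merge, or take multiple inputs and outputs, which a Sequential stack cannot express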
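As an illustration of that flexibility, the sketch below builds a model with two inputs that are merged before the output layer, something a plain Sequential stack cannot express. The input names and shapes are made up for the example.

```python
import tensorflow as tf

# Two separate inputs (hypothetical shapes, chosen only for illustration).
image_input = tf.keras.layers.Input(shape=(28, 28), name="image_input")
extra_input = tf.keras.layers.Input(shape=(4,), name="extra_input")

# Branch 1: flatten the image and pass it through a Dense layer.
image_branch = tf.keras.layers.Flatten()(image_input)
image_branch = tf.keras.layers.Dense(64, activation="relu")(image_branch)

# Merge both branches, then classify.
merged = tf.keras.layers.Concatenate()([image_branch, extra_input])
outputs = tf.keras.layers.Dense(10, activation="softmax")(merged)

two_input_model = tf.keras.Model(inputs=[image_input, extra_input],
                                 outputs=outputs,
                                 name="two_input_model")
print(two_input_model.output_shape)  # (None, 10)
```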
DEFINITION: feature vector
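a learned numeric representation of the input data: the vector of "features" a model produces for one input (in this notebook, the base model's output after pooling)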
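As a rough illustration, assume a feature map of shape (1, 7, 7, 1280) filled with random numbers (a stand-in, not real model output). Averaging over the two spatial axes produces one 1280-value feature vector per image, which matches the (None, 1280) shape printed after pooling later in this notebook.

```python
import numpy as np

# Stand-in for a base model's output: 1 image, 7x7 spatial grid, 1280 channels.
# The values are random; only the shapes matter here.
rng = np.random.default_rng(42)
feature_map = rng.random((1, 7, 7, 1280))

# Average over the height and width axes to get one value per channel.
feature_vector = feature_map.mean(axis=(1, 2))

print(feature_map.shape)     # (1, 7, 7, 1280)
print(feature_vector.shape)  # (1, 1280)
```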
Imports
import tensorflow as tf
from keras.callbacks import CSVLogger
Download & Import Helper Functions
# Download helper_functions.py script
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
# Import helper functions we're going to use
from helper_functions import create_tensorboard_callback, plot_loss_curves, unzip_data, walk_through_dir
--2024-06-26 11:23:06-- https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10246 (10K) [text/plain]
Saving to: ‘helper_functions.py.1’
helper_functions.py 100%[===================>] 10.01K --.-KB/s in 0.001s
2024-06-26 11:23:06 (11.4 MB/s) - ‘helper_functions.py.1’ saved [10246/10246]
Get Data
Transfer learning typically needs less training data than building a model from scratch.
Download the data
# Get SMALL set of data based on the food101 dataset
# !wget https://storage.googleapis.com/ztm_tf_course/food_vision/10_food_classes_10_percent.zip
# unzip_data("10_food_classes_10_percent.zip")
#
# check out some data
#
walk_through_dir("10_food_classes_10_percent")
There are 2 directories and 0 images in '10_food_classes_10_percent'.
There are 10 directories and 0 images in '10_food_classes_10_percent/test'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ice_cream'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/chicken_curry'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/steak'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/sushi'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/chicken_wings'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/grilled_salmon'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/hamburger'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/pizza'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/ramen'.
There are 0 directories and 250 images in '10_food_classes_10_percent/test/fried_rice'.
There are 10 directories and 0 images in '10_food_classes_10_percent/train'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/ice_cream'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/chicken_curry'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/steak'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/sushi'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/chicken_wings'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/grilled_salmon'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/hamburger'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/pizza'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/ramen'.
There are 0 directories and 75 images in '10_food_classes_10_percent/train/fried_rice'.
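A summary like the one above could be produced by walking the directory tree. The sketch below is an assumption about how such a helper might work, built on the standard library's `os.walk`; `count_dirs_and_files` is a hypothetical name and this is not the source of `walk_through_dir`. It builds a small temporary tree so it runs without the dataset.

```python
import os
import tempfile

def count_dirs_and_files(root):
    """Return (path, n_subdirectories, n_files) for every directory under root."""
    return [(dirpath, len(dirnames), len(filenames))
            for dirpath, dirnames, filenames in os.walk(root)]

# Build a tiny stand-in tree: <tmp>/train/pizza containing two empty files.
with tempfile.TemporaryDirectory() as tmp:
    class_dir = os.path.join(tmp, "train", "pizza")
    os.makedirs(class_dir)
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(class_dir, name), "w").close()

    summary = count_dirs_and_files(tmp)
    for dirpath, n_dirs, n_files in summary:
        print(f"There are {n_dirs} directories and {n_files} images in '{dirpath}'.")
```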
Setup Some Variables
data_dir_path = "10_food_classes_10_percent/"
train_dir_path = data_dir_path + "train/"
test_dir_path = data_dir_path + "test/"
IMG_OUTPUT_SIZE = (224, 224)
labelMode = "categorical"
# batchSize = 32
batchSize = 24
Split Data Into Train & Test
Using image_dataset_from_directory, we can create two TensorFlow datasets: one for training and one for testing.
train_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=train_dir_path,
                                                                            image_size=IMG_OUTPUT_SIZE,
                                                                            label_mode=labelMode,
                                                                            batch_size=batchSize)
# No batch_size is passed here, so the default of 32 is used
test_data_10_percent = tf.keras.preprocessing.image_dataset_from_directory(directory=test_dir_path,
                                                                           image_size=IMG_OUTPUT_SIZE,
                                                                           label_mode=labelMode)
Found 750 files belonging to 10 classes.
Found 2500 files belonging to 10 classes.
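The file counts above determine how many batches each dataset yields. A quick check, assuming the final partial batch is kept and that the test dataset uses the default batch size of 32 because none was passed:

```python
import math

train_images, train_batch_size = 750, 24   # from the output and batchSize above
test_images, test_batch_size = 2500, 32    # assumed default batch size

train_batches = math.ceil(train_images / train_batch_size)
test_batches = math.ceil(test_images / test_batch_size)

# Validating on 25% of the test batches, as done with validation_steps later on.
validation_steps = int(0.25 * test_batches)

print(train_batches)     # 32
print(test_batches)      # 79
print(validation_steps)  # 19
```

The 32 training batches match the `32/32` step counter that appears in the training log further down.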
Inspect the created dataset vars
# see what one of the vars is...
train_data_10_percent
<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 10), dtype=tf.float32, name=None))>
train_data_10_percent.class_names
['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon', 'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi']
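With label_mode set to "categorical", each label is a one-hot vector of length 10 whose position follows the order of class_names. A small NumPy sketch of that encoding follows; the helper `one_hot` and the predicted probabilities are hypothetical, written only for this illustration. It also shows that the cross-entropy computed from a one-hot label equals the value computed from the integer class index.

```python
import numpy as np

class_names = ['chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon',
               'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi']

def one_hot(class_name, names):
    """Return a float32 vector with a 1 at the index of class_name."""
    vector = np.zeros(len(names), dtype=np.float32)
    vector[names.index(class_name)] = 1.0
    return vector

pizza_label = one_hot("pizza", class_names)
print(pizza_label)  # 1.0 at index 6, zeros elsewhere

# Made-up predicted probabilities: 0.55 for pizza, 0.05 for each other class.
predicted = np.full(len(class_names), 0.05)
predicted[class_names.index("pizza")] = 0.55

# Categorical cross-entropy with a one-hot label ...
categorical_loss = -np.sum(pizza_label * np.log(predicted))
# ... equals the "sparse" form that indexes by the integer label.
sparse_loss = -np.log(predicted[class_names.index("pizza")])
```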
#
# see ALL methods available on the new vars
#
# dir(train_data_10_percent)
#
# preview some data using the "take" method
#
# for images, labels in train_data_10_percent.take(1):
#     print(images, labels)
Model: Transfer-Learning I
modelName = 'm0'
# Use only 25% of the test batches for validation during training
lessValidationDataCount = int(0.25 * len(test_data_10_percent))
csv_logger = CSVLogger(f'{modelName}-log.csv', append=True, separator=';')
#
# pre-trained model
#
# 1. Create base model with tf.keras.applications
# https://www.tensorflow.org/api_docs/python/tf/keras/applications/EfficientNetV2B0
base_model = tf.keras.applications.efficientnet_v2.EfficientNetV2B0(include_top=False)
# 2. Freeze the base model (so the pre-learned patterns remain)
base_model.trainable = False
#
# custom layer(s)
#
# 3. Create the input layer for the base model
inputLayer = tf.keras.layers.Input(shape=(224, 224, 3), name="input_layer")
# 4. If using ResNet50V2, add this to speed up convergence by rescaling inputs
# NOT for EfficientNetV2, which has its own rescaling layer
# appliedModel = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)(inputLayer)
# 5. Pass the input layer to the base_model (note: with tf.keras.applications, EfficientNetV2 inputs don't have to be normalized)
appliedModel = base_model(inputLayer)
# Check data shape after passing it to base_model
print(f"Shape after base_model: {appliedModel.shape}")
# 6. Average pool the outputs of the base model (condense each feature map into a single value, reducing the number of computations)
appliedModel = tf.keras.layers.GlobalAveragePooling2D(name="global_average_pooling_layer")(appliedModel)
print(f"After GlobalAveragePooling2D(): {appliedModel.shape}")
# 7. Create the output activation layer
outputs = tf.keras.layers.Dense(10, activation="softmax", name="output_layer")(appliedModel)
# 8. Combine the inputLayer with the outputs into a model
m0 = tf.keras.Model(inputLayer, outputs)
# 9. Compile the model
m0.compile(loss='categorical_crossentropy',
           optimizer=tf.keras.optimizers.Adam(),
           metrics=["accuracy"])
# 10. Fit the model (we use fewer steps for validation so it's faster)
m0History = m0.fit(train_data_10_percent,
                   epochs=5,
                   steps_per_epoch=len(train_data_10_percent),
                   validation_data=test_data_10_percent,
                   # Go through LESS of the validation data so epochs are faster (we want faster experiments!)
                   validation_steps=lessValidationDataCount,
                   # Track our model's training logs for visualization later
                   callbacks=[create_tensorboard_callback("transfer_learning", "10_percent_feature_extract"), csv_logger])
Shape after base_model: (None, 7, 7, 1280)
After GlobalAveragePooling2D(): (None, 1280)
Saving TensorBoard log files to: transfer_learning/10_percent_feature_extract/20240626-121903
Epoch 1/5
32/32 [==============================] - 43s 1s/step - loss: 1.8216 - accuracy: 0.4587 - val_loss: 1.2248 - val_accuracy: 0.7566
Epoch 2/5
32/32 [==============================] - 34s 1s/step - loss: 1.0554 - accuracy: 0.7533 - val_loss: 0.8196 - val_accuracy: 0.8240
Epoch 3/5
32/32 [==============================] - 41s 1s/step - loss: 0.7753 - accuracy: 0.8320 - val_loss: 0.6813 - val_accuracy: 0.8487
Epoch 4/5
32/32 [==============================] - 34s 1s/step - loss: 0.6335 - accuracy: 0.8613 - val_loss: 0.5982 - val_accuracy: 0.8470
Epoch 5/5
32/32 [==============================] - 34s 1s/step - loss: 0.5509 - accuracy: 0.8867 - val_loss: 0.5562 - val_accuracy: 0.8635
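The CSVLogger callback above is configured to write the training log with ';' as the separator. The sketch below parses such a log with the standard library. The column layout is an assumption about the file's format, and the rows are copied from the training output above into an in-memory string rather than read from m0-log.csv.

```python
import csv
import io

# Assumed layout of the log; values copied from the training output above.
log_text = """epoch;accuracy;loss;val_accuracy;val_loss
0;0.4587;1.8216;0.7566;1.2248
1;0.7533;1.0554;0.8240;0.8196
2;0.8320;0.7753;0.8487;0.6813
3;0.8613;0.6335;0.8470;0.5982
4;0.8867;0.5509;0.8635;0.5562
"""

rows = list(csv.DictReader(io.StringIO(log_text), delimiter=";"))

# Pick the row with the highest validation accuracy.
best = max(rows, key=lambda row: float(row["val_accuracy"]))

print(len(rows))             # 5
print(best["epoch"])         # 4
print(best["val_accuracy"])  # 0.8635
```

With a real log file, `io.StringIO(log_text)` would be replaced by an open file handle, for example `open("m0-log.csv", newline="")`.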
Inspect The Model & Results
Model Layers
for layer_number, layer in enumerate(base_model.layers):
    print(layer_number, layer.name)
0 input_1
1 rescaling
2 normalization
3 stem_conv
4 stem_bn
5 stem_activation
6 block1a_project_conv
7 block1a_project_bn
8 block1a_project_activation
9 block2a_expand_conv
10 block2a_expand_bn
11 block2a_expand_activation
12 block2a_project_conv
13 block2a_project_bn
14 block2b_expand_conv
15 block2b_expand_bn
16 block2b_expand_activation
17 block2b_project_conv
18 block2b_project_bn
19 block2b_drop
20 block2b_add
21 block3a_expand_conv
22 block3a_expand_bn
23 block3a_expand_activation
24 block3a_project_conv
25 block3a_project_bn
26 block3b_expand_conv
27 block3b_expand_bn
28 block3b_expand_activation
29 block3b_project_conv
30 block3b_project_bn
31 block3b_drop
32 block3b_add
33 block4a_expand_conv
34 block4a_expand_bn
35 block4a_expand_activation
36 block4a_dwconv2
37 block4a_bn
38 block4a_activation
39 block4a_se_squeeze
40 block4a_se_reshape
41 block4a_se_reduce
42 block4a_se_expand
43 block4a_se_excite
44 block4a_project_conv
45 block4a_project_bn
46 block4b_expand_conv
47 block4b_expand_bn
48 block4b_expand_activation
49 block4b_dwconv2
50 block4b_bn
51 block4b_activation
52 block4b_se_squeeze
53 block4b_se_reshape
54 block4b_se_reduce
55 block4b_se_expand
56 block4b_se_excite
57 block4b_project_conv
58 block4b_project_bn
59 block4b_drop
60 block4b_add
61 block4c_expand_conv
62 block4c_expand_bn
63 block4c_expand_activation
64 block4c_dwconv2
65 block4c_bn
66 block4c_activation
67 block4c_se_squeeze
68 block4c_se_reshape
69 block4c_se_reduce
70 block4c_se_expand
71 block4c_se_excite
72 block4c_project_conv
73 block4c_project_bn
74 block4c_drop
75 block4c_add
76 block5a_expand_conv
77 block5a_expand_bn
78 block5a_expand_activation
79 block5a_dwconv2
80 block5a_bn
81 block5a_activation
82 block5a_se_squeeze
83 block5a_se_reshape
84 block5a_se_reduce
85 block5a_se_expand
86 block5a_se_excite
87 block5a_project_conv
88 block5a_project_bn
89 block5b_expand_conv
90 block5b_expand_bn
91 block5b_expand_activation
92 block5b_dwconv2
93 block5b_bn
94 block5b_activation
95 block5b_se_squeeze
96 block5b_se_reshape
97 block5b_se_reduce
98 block5b_se_expand
99 block5b_se_excite
100 block5b_project_conv
101 block5b_project_bn
102 block5b_drop
103 block5b_add
104 block5c_expand_conv
105 block5c_expand_bn
106 block5c_expand_activation
107 block5c_dwconv2
108 block5c_bn
109 block5c_activation
110 block5c_se_squeeze
111 block5c_se_reshape
112 block5c_se_reduce
113 block5c_se_expand
114 block5c_se_excite
115 block5c_project_conv
116 block5c_project_bn
117 block5c_drop
118 block5c_add
119 block5d_expand_conv
120 block5d_expand_bn
121 block5d_expand_activation
122 block5d_dwconv2
123 block5d_bn
124 block5d_activation
125 block5d_se_squeeze
126 block5d_se_reshape
127 block5d_se_reduce
128 block5d_se_expand
129 block5d_se_excite
130 block5d_project_conv
131 block5d_project_bn
132 block5d_drop
133 block5d_add
134 block5e_expand_conv
135 block5e_expand_bn
136 block5e_expand_activation
137 block5e_dwconv2
138 block5e_bn
139 block5e_activation
140 block5e_se_squeeze
141 block5e_se_reshape
142 block5e_se_reduce
143 block5e_se_expand
144 block5e_se_excite
145 block5e_project_conv
146 block5e_project_bn
147 block5e_drop
148 block5e_add
149 block6a_expand_conv
150 block6a_expand_bn
151 block6a_expand_activation
152 block6a_dwconv2
153 block6a_bn
154 block6a_activation
155 block6a_se_squeeze
156 block6a_se_reshape
157 block6a_se_reduce
158 block6a_se_expand
159 block6a_se_excite
160 block6a_project_conv
161 block6a_project_bn
162 block6b_expand_conv
163 block6b_expand_bn
164 block6b_expand_activation
165 block6b_dwconv2
166 block6b_bn
167 block6b_activation
168 block6b_se_squeeze
169 block6b_se_reshape
170 block6b_se_reduce
171 block6b_se_expand
172 block6b_se_excite
173 block6b_project_conv
174 block6b_project_bn
175 block6b_drop
176 block6b_add
177 block6c_expand_conv
178 block6c_expand_bn
179 block6c_expand_activation
180 block6c_dwconv2
181 block6c_bn
182 block6c_activation
183 block6c_se_squeeze
184 block6c_se_reshape
185 block6c_se_reduce
186 block6c_se_expand
187 block6c_se_excite
188 block6c_project_conv
189 block6c_project_bn
190 block6c_drop
191 block6c_add
192 block6d_expand_conv
193 block6d_expand_bn
194 block6d_expand_activation
195 block6d_dwconv2
196 block6d_bn
197 block6d_activation
198 block6d_se_squeeze
199 block6d_se_reshape
200 block6d_se_reduce
201 block6d_se_expand
202 block6d_se_excite
203 block6d_project_conv
204 block6d_project_bn
205 block6d_drop
206 block6d_add
207 block6e_expand_conv
208 block6e_expand_bn
209 block6e_expand_activation
210 block6e_dwconv2
211 block6e_bn
212 block6e_activation
213 block6e_se_squeeze
214 block6e_se_reshape
215 block6e_se_reduce
216 block6e_se_expand
217 block6e_se_excite
218 block6e_project_conv
219 block6e_project_bn
220 block6e_drop
221 block6e_add
222 block6f_expand_conv
223 block6f_expand_bn
224 block6f_expand_activation
225 block6f_dwconv2
226 block6f_bn
227 block6f_activation
228 block6f_se_squeeze
229 block6f_se_reshape
230 block6f_se_reduce
231 block6f_se_expand
232 block6f_se_excite
233 block6f_project_conv
234 block6f_project_bn
235 block6f_drop
236 block6f_add
237 block6g_expand_conv
238 block6g_expand_bn
239 block6g_expand_activation
240 block6g_dwconv2
241 block6g_bn
242 block6g_activation
243 block6g_se_squeeze
244 block6g_se_reshape
245 block6g_se_reduce
246 block6g_se_expand
247 block6g_se_excite
248 block6g_project_conv
249 block6g_project_bn
250 block6g_drop
251 block6g_add
252 block6h_expand_conv
253 block6h_expand_bn
254 block6h_expand_activation
255 block6h_dwconv2
256 block6h_bn
257 block6h_activation
258 block6h_se_squeeze
259 block6h_se_reshape
260 block6h_se_reduce
261 block6h_se_expand
262 block6h_se_excite
263 block6h_project_conv
264 block6h_project_bn
265 block6h_drop
266 block6h_add
267 top_conv
268 top_bn
269 top_activation
Model Summary
base_model.summary()
Model: "efficientnetv2-b0"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, None, None, 3)] 0 []
rescaling (Rescaling) (None, None, None, 3) 0 ['input_1[0][0]']
normalization (Normalizati (None, None, None, 3) 0 ['rescaling[0][0]']
on)
stem_conv (Conv2D) (None, None, None, 32) 864 ['normalization[0][0]']
stem_bn (BatchNormalizatio (None, None, None, 32) 128 ['stem_conv[0][0]']
n)
stem_activation (Activatio (None, None, None, 32) 0 ['stem_bn[0][0]']
n)
block1a_project_conv (Conv (None, None, None, 16) 4608 ['stem_activation[0][0]']
2D)
block1a_project_bn (BatchN (None, None, None, 16) 64 ['block1a_project_conv[0][0]']
ormalization)
block1a_project_activation (None, None, None, 16) 0 ['block1a_project_bn[0][0]']
(Activation)
block2a_expand_conv (Conv2 (None, None, None, 64) 9216 ['block1a_project_activation[0
D) ][0]']
block2a_expand_bn (BatchNo (None, None, None, 64) 256 ['block2a_expand_conv[0][0]']
rmalization)
block2a_expand_activation (None, None, None, 64) 0 ['block2a_expand_bn[0][0]']
(Activation)
block2a_project_conv (Conv (None, None, None, 32) 2048 ['block2a_expand_activation[0]
2D) [0]']
block2a_project_bn (BatchN (None, None, None, 32) 128 ['block2a_project_conv[0][0]']
ormalization)
block2b_expand_conv (Conv2 (None, None, None, 128) 36864 ['block2a_project_bn[0][0]']
D)
block2b_expand_bn (BatchNo (None, None, None, 128) 512 ['block2b_expand_conv[0][0]']
rmalization)
block2b_expand_activation (None, None, None, 128) 0 ['block2b_expand_bn[0][0]']
(Activation)
block2b_project_conv (Conv (None, None, None, 32) 4096 ['block2b_expand_activation[0]
2D) [0]']
block2b_project_bn (BatchN (None, None, None, 32) 128 ['block2b_project_conv[0][0]']
ormalization)
block2b_drop (Dropout) (None, None, None, 32) 0 ['block2b_project_bn[0][0]']
block2b_add (Add) (None, None, None, 32) 0 ['block2b_drop[0][0]',
'block2a_project_bn[0][0]']
block3a_expand_conv (Conv2 (None, None, None, 128) 36864 ['block2b_add[0][0]']
D)
block3a_expand_bn (BatchNo (None, None, None, 128) 512 ['block3a_expand_conv[0][0]']
rmalization)
block3a_expand_activation (None, None, None, 128) 0 ['block3a_expand_bn[0][0]']
(Activation)
block3a_project_conv (Conv (None, None, None, 48) 6144 ['block3a_expand_activation[0]
2D) [0]']
block3a_project_bn (BatchN (None, None, None, 48) 192 ['block3a_project_conv[0][0]']
ormalization)
block3b_expand_conv (Conv2 (None, None, None, 192) 82944 ['block3a_project_bn[0][0]']
D)
block3b_expand_bn (BatchNo (None, None, None, 192) 768 ['block3b_expand_conv[0][0]']
rmalization)
block3b_expand_activation (None, None, None, 192) 0 ['block3b_expand_bn[0][0]']
(Activation)
block3b_project_conv (Conv (None, None, None, 48) 9216 ['block3b_expand_activation[0]
2D) [0]']
block3b_project_bn (BatchN (None, None, None, 48) 192 ['block3b_project_conv[0][0]']
ormalization)
block3b_drop (Dropout) (None, None, None, 48) 0 ['block3b_project_bn[0][0]']
block3b_add (Add) (None, None, None, 48) 0 ['block3b_drop[0][0]',
'block3a_project_bn[0][0]']
block4a_expand_conv (Conv2 (None, None, None, 192) 9216 ['block3b_add[0][0]']
D)
block4a_expand_bn (BatchNo (None, None, None, 192) 768 ['block4a_expand_conv[0][0]']
rmalization)
block4a_expand_activation (None, None, None, 192) 0 ['block4a_expand_bn[0][0]']
(Activation)
block4a_dwconv2 (Depthwise (None, None, None, 192) 1728 ['block4a_expand_activation[0]
Conv2D) [0]']
block4a_bn (BatchNormaliza (None, None, None, 192) 768 ['block4a_dwconv2[0][0]']
tion)
block4a_activation (Activa (None, None, None, 192) 0 ['block4a_bn[0][0]']
tion)
block4a_se_squeeze (Global (None, 192) 0 ['block4a_activation[0][0]']
AveragePooling2D)
block4a_se_reshape (Reshap (None, 1, 1, 192) 0 ['block4a_se_squeeze[0][0]']
e)
block4a_se_reduce (Conv2D) (None, 1, 1, 12) 2316 ['block4a_se_reshape[0][0]']
block4a_se_expand (Conv2D) (None, 1, 1, 192) 2496 ['block4a_se_reduce[0][0]']
block4a_se_excite (Multipl (None, None, None, 192) 0 ['block4a_activation[0][0]',
y) 'block4a_se_expand[0][0]']
block4a_project_conv (Conv (None, None, None, 96) 18432 ['block4a_se_excite[0][0]']
2D)
block4a_project_bn (BatchN (None, None, None, 96) 384 ['block4a_project_conv[0][0]']
ormalization)
block4b_expand_conv (Conv2 (None, None, None, 384) 36864 ['block4a_project_bn[0][0]']
D)
block4b_expand_bn (BatchNo (None, None, None, 384) 1536 ['block4b_expand_conv[0][0]']
rmalization)
block4b_expand_activation (None, None, None, 384) 0 ['block4b_expand_bn[0][0]']
(Activation)
block4b_dwconv2 (Depthwise (None, None, None, 384) 3456 ['block4b_expand_activation[0]
Conv2D) [0]']
block4b_bn (BatchNormaliza (None, None, None, 384) 1536 ['block4b_dwconv2[0][0]']
tion)
block4b_activation (Activa (None, None, None, 384) 0 ['block4b_bn[0][0]']
tion)
block4b_se_squeeze (Global (None, 384) 0 ['block4b_activation[0][0]']
AveragePooling2D)
block4b_se_reshape (Reshap (None, 1, 1, 384) 0 ['block4b_se_squeeze[0][0]']
e)
block4b_se_reduce (Conv2D) (None, 1, 1, 24) 9240 ['block4b_se_reshape[0][0]']
block4b_se_expand (Conv2D) (None, 1, 1, 384) 9600 ['block4b_se_reduce[0][0]']
block4b_se_excite (Multipl (None, None, None, 384) 0 ['block4b_activation[0][0]',
y) 'block4b_se_expand[0][0]']
block4b_project_conv (Conv (None, None, None, 96) 36864 ['block4b_se_excite[0][0]']
2D)
block4b_project_bn (BatchN (None, None, None, 96) 384 ['block4b_project_conv[0][0]']
ormalization)
block4b_drop (Dropout) (None, None, None, 96) 0 ['block4b_project_bn[0][0]']
block4b_add (Add) (None, None, None, 96) 0 ['block4b_drop[0][0]',
'block4a_project_bn[0][0]']
block4c_expand_conv (Conv2 (None, None, None, 384) 36864 ['block4b_add[0][0]']
D)
block4c_expand_bn (BatchNo (None, None, None, 384) 1536 ['block4c_expand_conv[0][0]']
rmalization)
block4c_expand_activation (None, None, None, 384) 0 ['block4c_expand_bn[0][0]']
(Activation)
block4c_dwconv2 (Depthwise (None, None, None, 384) 3456 ['block4c_expand_activation[0]
Conv2D) [0]']
block4c_bn (BatchNormaliza (None, None, None, 384) 1536 ['block4c_dwconv2[0][0]']
tion)
block4c_activation (Activa (None, None, None, 384) 0 ['block4c_bn[0][0]']
tion)
block4c_se_squeeze (Global (None, 384) 0 ['block4c_activation[0][0]']
AveragePooling2D)
block4c_se_reshape (Reshap (None, 1, 1, 384) 0 ['block4c_se_squeeze[0][0]']
e)
block4c_se_reduce (Conv2D) (None, 1, 1, 24) 9240 ['block4c_se_reshape[0][0]']
block4c_se_expand (Conv2D) (None, 1, 1, 384) 9600 ['block4c_se_reduce[0][0]']
block4c_se_excite (Multipl (None, None, None, 384) 0 ['block4c_activation[0][0]',
y) 'block4c_se_expand[0][0]']
block4c_project_conv (Conv (None, None, None, 96) 36864 ['block4c_se_excite[0][0]']
2D)
block4c_project_bn (BatchN (None, None, None, 96) 384 ['block4c_project_conv[0][0]']
ormalization)
block4c_drop (Dropout) (None, None, None, 96) 0 ['block4c_project_bn[0][0]']
block4c_add (Add) (None, None, None, 96) 0 ['block4c_drop[0][0]',
'block4b_add[0][0]']
block5a_expand_conv (Conv2 (None, None, None, 576) 55296 ['block4c_add[0][0]']
D)
block5a_expand_bn (BatchNo (None, None, None, 576) 2304 ['block5a_expand_conv[0][0]']
rmalization)
block5a_expand_activation (None, None, None, 576) 0 ['block5a_expand_bn[0][0]']
(Activation)
block5a_dwconv2 (Depthwise (None, None, None, 576) 5184 ['block5a_expand_activation[0]
Conv2D) [0]']
block5a_bn (BatchNormaliza (None, None, None, 576) 2304 ['block5a_dwconv2[0][0]']
tion)
block5a_activation (Activa (None, None, None, 576) 0 ['block5a_bn[0][0]']
tion)
block5a_se_squeeze (Global (None, 576) 0 ['block5a_activation[0][0]']
AveragePooling2D)
block5a_se_reshape (Reshap (None, 1, 1, 576) 0 ['block5a_se_squeeze[0][0]']
e)
block5a_se_reduce (Conv2D) (None, 1, 1, 24) 13848 ['block5a_se_reshape[0][0]']
block5a_se_expand (Conv2D) (None, 1, 1, 576) 14400 ['block5a_se_reduce[0][0]']
block5a_se_excite (Multipl (None, None, None, 576) 0 ['block5a_activation[0][0]',
y) 'block5a_se_expand[0][0]']
block5a_project_conv (Conv (None, None, None, 112) 64512 ['block5a_se_excite[0][0]']
2D)
block5a_project_bn (BatchN (None, None, None, 112) 448 ['block5a_project_conv[0][0]']
ormalization)
block5b_expand_conv (Conv2 (None, None, None, 672) 75264 ['block5a_project_bn[0][0]']
D)
block5b_expand_bn (BatchNo (None, None, None, 672) 2688 ['block5b_expand_conv[0][0]']
rmalization)
block5b_expand_activation (None, None, None, 672) 0 ['block5b_expand_bn[0][0]']
(Activation)
block5b_dwconv2 (Depthwise (None, None, None, 672) 6048 ['block5b_expand_activation[0]
Conv2D) [0]']
block5b_bn (BatchNormaliza (None, None, None, 672) 2688 ['block5b_dwconv2[0][0]']
tion)
block5b_activation (Activa (None, None, None, 672) 0 ['block5b_bn[0][0]']
tion)
block5b_se_squeeze (Global (None, 672) 0 ['block5b_activation[0][0]']
AveragePooling2D)
block5b_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5b_se_squeeze[0][0]']
e)
block5b_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5b_se_reshape[0][0]']
block5b_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5b_se_reduce[0][0]']
block5b_se_excite (Multipl (None, None, None, 672) 0 ['block5b_activation[0][0]',
y) 'block5b_se_expand[0][0]']
block5b_project_conv (Conv (None, None, None, 112) 75264 ['block5b_se_excite[0][0]']
2D)
block5b_project_bn (BatchN (None, None, None, 112) 448 ['block5b_project_conv[0][0]']
ormalization)
block5b_drop (Dropout) (None, None, None, 112) 0 ['block5b_project_bn[0][0]']
block5b_add (Add) (None, None, None, 112) 0 ['block5b_drop[0][0]',
'block5a_project_bn[0][0]']
block5c_expand_conv (Conv2 (None, None, None, 672) 75264 ['block5b_add[0][0]']
D)
block5c_expand_bn (BatchNo (None, None, None, 672) 2688 ['block5c_expand_conv[0][0]']
rmalization)
block5c_expand_activation (None, None, None, 672) 0 ['block5c_expand_bn[0][0]']
(Activation)
block5c_dwconv2 (Depthwise (None, None, None, 672) 6048 ['block5c_expand_activation[0]
Conv2D) [0]']
block5c_bn (BatchNormaliza (None, None, None, 672) 2688 ['block5c_dwconv2[0][0]']
tion)
block5c_activation (Activa (None, None, None, 672) 0 ['block5c_bn[0][0]']
tion)
block5c_se_squeeze (Global (None, 672) 0 ['block5c_activation[0][0]']
AveragePooling2D)
block5c_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5c_se_squeeze[0][0]']
e)
block5c_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5c_se_reshape[0][0]']
block5c_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5c_se_reduce[0][0]']
block5c_se_excite (Multipl (None, None, None, 672) 0 ['block5c_activation[0][0]',
y) 'block5c_se_expand[0][0]']
block5c_project_conv (Conv (None, None, None, 112) 75264 ['block5c_se_excite[0][0]']
2D)
block5c_project_bn (BatchN (None, None, None, 112) 448 ['block5c_project_conv[0][0]']
ormalization)
block5c_drop (Dropout) (None, None, None, 112) 0 ['block5c_project_bn[0][0]']
block5c_add (Add) (None, None, None, 112) 0 ['block5c_drop[0][0]',
'block5b_add[0][0]']
block5d_expand_conv (Conv2 (None, None, None, 672) 75264 ['block5c_add[0][0]']
D)
block5d_expand_bn (BatchNo (None, None, None, 672) 2688 ['block5d_expand_conv[0][0]']
rmalization)
block5d_expand_activation (None, None, None, 672) 0 ['block5d_expand_bn[0][0]']
(Activation)
block5d_dwconv2 (Depthwise (None, None, None, 672) 6048 ['block5d_expand_activation[0]
Conv2D) [0]']
block5d_bn (BatchNormaliza (None, None, None, 672) 2688 ['block5d_dwconv2[0][0]']
tion)
block5d_activation (Activa (None, None, None, 672) 0 ['block5d_bn[0][0]']
tion)
block5d_se_squeeze (Global (None, 672) 0 ['block5d_activation[0][0]']
AveragePooling2D)
block5d_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5d_se_squeeze[0][0]']
e)
block5d_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5d_se_reshape[0][0]']
block5d_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5d_se_reduce[0][0]']
block5d_se_excite (Multipl (None, None, None, 672) 0 ['block5d_activation[0][0]',
y) 'block5d_se_expand[0][0]']
block5d_project_conv (Conv (None, None, None, 112) 75264 ['block5d_se_excite[0][0]']
2D)
block5d_project_bn (BatchN (None, None, None, 112) 448 ['block5d_project_conv[0][0]']
ormalization)
block5d_drop (Dropout) (None, None, None, 112) 0 ['block5d_project_bn[0][0]']
block5d_add (Add) (None, None, None, 112) 0 ['block5d_drop[0][0]',
'block5c_add[0][0]']
block5e_expand_conv (Conv2 (None, None, None, 672) 75264 ['block5d_add[0][0]']
D)
block5e_expand_bn (BatchNo (None, None, None, 672) 2688 ['block5e_expand_conv[0][0]']
rmalization)
block5e_expand_activation (None, None, None, 672) 0 ['block5e_expand_bn[0][0]']
(Activation)
block5e_dwconv2 (Depthwise (None, None, None, 672) 6048 ['block5e_expand_activation[0]
Conv2D) [0]']
block5e_bn (BatchNormaliza (None, None, None, 672) 2688 ['block5e_dwconv2[0][0]']
tion)
block5e_activation (Activa (None, None, None, 672) 0 ['block5e_bn[0][0]']
tion)
block5e_se_squeeze (Global (None, 672) 0 ['block5e_activation[0][0]']
AveragePooling2D)
block5e_se_reshape (Reshap (None, 1, 1, 672) 0 ['block5e_se_squeeze[0][0]']
e)
block5e_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block5e_se_reshape[0][0]']
block5e_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block5e_se_reduce[0][0]']
block5e_se_excite (Multipl (None, None, None, 672) 0 ['block5e_activation[0][0]',
y) 'block5e_se_expand[0][0]']
block5e_project_conv (Conv (None, None, None, 112) 75264 ['block5e_se_excite[0][0]']
2D)
block5e_project_bn (BatchN (None, None, None, 112) 448 ['block5e_project_conv[0][0]']
ormalization)
block5e_drop (Dropout) (None, None, None, 112) 0 ['block5e_project_bn[0][0]']
block5e_add (Add) (None, None, None, 112) 0 ['block5e_drop[0][0]',
'block5d_add[0][0]']
block6a_expand_conv (Conv2 (None, None, None, 672) 75264 ['block5e_add[0][0]']
D)
block6a_expand_bn (BatchNo (None, None, None, 672) 2688 ['block6a_expand_conv[0][0]']
rmalization)
block6a_expand_activation (None, None, None, 672) 0 ['block6a_expand_bn[0][0]']
(Activation)
block6a_dwconv2 (Depthwise (None, None, None, 672) 6048 ['block6a_expand_activation[0]
Conv2D) [0]']
block6a_bn (BatchNormaliza (None, None, None, 672) 2688 ['block6a_dwconv2[0][0]']
tion)
block6a_activation (Activa (None, None, None, 672) 0 ['block6a_bn[0][0]']
tion)
block6a_se_squeeze (Global (None, 672) 0 ['block6a_activation[0][0]']
AveragePooling2D)
block6a_se_reshape (Reshap (None, 1, 1, 672) 0 ['block6a_se_squeeze[0][0]']
e)
block6a_se_reduce (Conv2D) (None, 1, 1, 28) 18844 ['block6a_se_reshape[0][0]']
block6a_se_expand (Conv2D) (None, 1, 1, 672) 19488 ['block6a_se_reduce[0][0]']
block6a_se_excite (Multipl (None, None, None, 672) 0 ['block6a_activation[0][0]',
y) 'block6a_se_expand[0][0]']
block6a_project_conv (Conv (None, None, None, 192) 129024 ['block6a_se_excite[0][0]']
2D)
block6a_project_bn (BatchN (None, None, None, 192) 768 ['block6a_project_conv[0][0]']
ormalization)
block6b_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6a_project_bn[0][0]']
D)
block6b_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6b_expand_conv[0][0]']
rmalization)
block6b_expand_activation (None, None, None, 1152) 0 ['block6b_expand_bn[0][0]']
(Activation)
block6b_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6b_expand_activation[0]
Conv2D) [0]']
block6b_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6b_dwconv2[0][0]']
tion)
block6b_activation (Activa (None, None, None, 1152) 0 ['block6b_bn[0][0]']
tion)
block6b_se_squeeze (Global (None, 1152) 0 ['block6b_activation[0][0]']
AveragePooling2D)
block6b_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6b_se_squeeze[0][0]']
e)
block6b_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6b_se_reshape[0][0]']
block6b_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6b_se_reduce[0][0]']
block6b_se_excite (Multipl (None, None, None, 1152) 0 ['block6b_activation[0][0]',
y) 'block6b_se_expand[0][0]']
block6b_project_conv (Conv (None, None, None, 192) 221184 ['block6b_se_excite[0][0]']
2D)
block6b_project_bn (BatchN (None, None, None, 192) 768 ['block6b_project_conv[0][0]']
ormalization)
block6b_drop (Dropout) (None, None, None, 192) 0 ['block6b_project_bn[0][0]']
block6b_add (Add) (None, None, None, 192) 0 ['block6b_drop[0][0]',
'block6a_project_bn[0][0]']
block6c_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6b_add[0][0]']
D)
block6c_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6c_expand_conv[0][0]']
rmalization)
block6c_expand_activation (None, None, None, 1152) 0 ['block6c_expand_bn[0][0]']
(Activation)
block6c_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6c_expand_activation[0]
Conv2D) [0]']
block6c_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6c_dwconv2[0][0]']
tion)
block6c_activation (Activa (None, None, None, 1152) 0 ['block6c_bn[0][0]']
tion)
block6c_se_squeeze (Global (None, 1152) 0 ['block6c_activation[0][0]']
AveragePooling2D)
block6c_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6c_se_squeeze[0][0]']
e)
block6c_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6c_se_reshape[0][0]']
block6c_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6c_se_reduce[0][0]']
block6c_se_excite (Multipl (None, None, None, 1152) 0 ['block6c_activation[0][0]',
y) 'block6c_se_expand[0][0]']
block6c_project_conv (Conv (None, None, None, 192) 221184 ['block6c_se_excite[0][0]']
2D)
block6c_project_bn (BatchN (None, None, None, 192) 768 ['block6c_project_conv[0][0]']
ormalization)
block6c_drop (Dropout) (None, None, None, 192) 0 ['block6c_project_bn[0][0]']
block6c_add (Add) (None, None, None, 192) 0 ['block6c_drop[0][0]',
'block6b_add[0][0]']
block6d_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6c_add[0][0]']
D)
block6d_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6d_expand_conv[0][0]']
rmalization)
block6d_expand_activation (None, None, None, 1152) 0 ['block6d_expand_bn[0][0]']
(Activation)
block6d_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6d_expand_activation[0]
Conv2D) [0]']
block6d_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6d_dwconv2[0][0]']
tion)
block6d_activation (Activa (None, None, None, 1152) 0 ['block6d_bn[0][0]']
tion)
block6d_se_squeeze (Global (None, 1152) 0 ['block6d_activation[0][0]']
AveragePooling2D)
block6d_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6d_se_squeeze[0][0]']
e)
block6d_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6d_se_reshape[0][0]']
block6d_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6d_se_reduce[0][0]']
block6d_se_excite (Multipl (None, None, None, 1152) 0 ['block6d_activation[0][0]',
y) 'block6d_se_expand[0][0]']
block6d_project_conv (Conv (None, None, None, 192) 221184 ['block6d_se_excite[0][0]']
2D)
block6d_project_bn (BatchN (None, None, None, 192) 768 ['block6d_project_conv[0][0]']
ormalization)
block6d_drop (Dropout) (None, None, None, 192) 0 ['block6d_project_bn[0][0]']
block6d_add (Add) (None, None, None, 192) 0 ['block6d_drop[0][0]',
'block6c_add[0][0]']
block6e_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6d_add[0][0]']
D)
block6e_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6e_expand_conv[0][0]']
rmalization)
block6e_expand_activation (None, None, None, 1152) 0 ['block6e_expand_bn[0][0]']
(Activation)
block6e_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6e_expand_activation[0]
Conv2D) [0]']
block6e_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6e_dwconv2[0][0]']
tion)
block6e_activation (Activa (None, None, None, 1152) 0 ['block6e_bn[0][0]']
tion)
block6e_se_squeeze (Global (None, 1152) 0 ['block6e_activation[0][0]']
AveragePooling2D)
block6e_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6e_se_squeeze[0][0]']
e)
block6e_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6e_se_reshape[0][0]']
block6e_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6e_se_reduce[0][0]']
block6e_se_excite (Multipl (None, None, None, 1152) 0 ['block6e_activation[0][0]',
y) 'block6e_se_expand[0][0]']
block6e_project_conv (Conv (None, None, None, 192) 221184 ['block6e_se_excite[0][0]']
2D)
block6e_project_bn (BatchN (None, None, None, 192) 768 ['block6e_project_conv[0][0]']
ormalization)
block6e_drop (Dropout) (None, None, None, 192) 0 ['block6e_project_bn[0][0]']
block6e_add (Add) (None, None, None, 192) 0 ['block6e_drop[0][0]',
'block6d_add[0][0]']
block6f_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6e_add[0][0]']
D)
block6f_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6f_expand_conv[0][0]']
rmalization)
block6f_expand_activation (None, None, None, 1152) 0 ['block6f_expand_bn[0][0]']
(Activation)
block6f_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6f_expand_activation[0]
Conv2D) [0]']
block6f_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6f_dwconv2[0][0]']
tion)
block6f_activation (Activa (None, None, None, 1152) 0 ['block6f_bn[0][0]']
tion)
block6f_se_squeeze (Global (None, 1152) 0 ['block6f_activation[0][0]']
AveragePooling2D)
block6f_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6f_se_squeeze[0][0]']
e)
block6f_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6f_se_reshape[0][0]']
block6f_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6f_se_reduce[0][0]']
block6f_se_excite (Multipl (None, None, None, 1152) 0 ['block6f_activation[0][0]',
y) 'block6f_se_expand[0][0]']
block6f_project_conv (Conv (None, None, None, 192) 221184 ['block6f_se_excite[0][0]']
2D)
block6f_project_bn (BatchN (None, None, None, 192) 768 ['block6f_project_conv[0][0]']
ormalization)
block6f_drop (Dropout) (None, None, None, 192) 0 ['block6f_project_bn[0][0]']
block6f_add (Add) (None, None, None, 192) 0 ['block6f_drop[0][0]',
'block6e_add[0][0]']
block6g_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6f_add[0][0]']
D)
block6g_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6g_expand_conv[0][0]']
rmalization)
block6g_expand_activation (None, None, None, 1152) 0 ['block6g_expand_bn[0][0]']
(Activation)
block6g_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6g_expand_activation[0]
Conv2D) [0]']
block6g_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6g_dwconv2[0][0]']
tion)
block6g_activation (Activa (None, None, None, 1152) 0 ['block6g_bn[0][0]']
tion)
block6g_se_squeeze (Global (None, 1152) 0 ['block6g_activation[0][0]']
AveragePooling2D)
block6g_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6g_se_squeeze[0][0]']
e)
block6g_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6g_se_reshape[0][0]']
block6g_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6g_se_reduce[0][0]']
block6g_se_excite (Multipl (None, None, None, 1152) 0 ['block6g_activation[0][0]',
y) 'block6g_se_expand[0][0]']
block6g_project_conv (Conv (None, None, None, 192) 221184 ['block6g_se_excite[0][0]']
2D)
block6g_project_bn (BatchN (None, None, None, 192) 768 ['block6g_project_conv[0][0]']
ormalization)
block6g_drop (Dropout) (None, None, None, 192) 0 ['block6g_project_bn[0][0]']
block6g_add (Add) (None, None, None, 192) 0 ['block6g_drop[0][0]',
'block6f_add[0][0]']
block6h_expand_conv (Conv2 (None, None, None, 1152) 221184 ['block6g_add[0][0]']
D)
block6h_expand_bn (BatchNo (None, None, None, 1152) 4608 ['block6h_expand_conv[0][0]']
rmalization)
block6h_expand_activation (None, None, None, 1152) 0 ['block6h_expand_bn[0][0]']
(Activation)
block6h_dwconv2 (Depthwise (None, None, None, 1152) 10368 ['block6h_expand_activation[0]
Conv2D) [0]']
block6h_bn (BatchNormaliza (None, None, None, 1152) 4608 ['block6h_dwconv2[0][0]']
tion)
block6h_activation (Activa (None, None, None, 1152) 0 ['block6h_bn[0][0]']
tion)
block6h_se_squeeze (Global (None, 1152) 0 ['block6h_activation[0][0]']
AveragePooling2D)
block6h_se_reshape (Reshap (None, 1, 1, 1152) 0 ['block6h_se_squeeze[0][0]']
e)
block6h_se_reduce (Conv2D) (None, 1, 1, 48) 55344 ['block6h_se_reshape[0][0]']
block6h_se_expand (Conv2D) (None, 1, 1, 1152) 56448 ['block6h_se_reduce[0][0]']
block6h_se_excite (Multipl (None, None, None, 1152) 0 ['block6h_activation[0][0]',
y) 'block6h_se_expand[0][0]']
block6h_project_conv (Conv (None, None, None, 192) 221184 ['block6h_se_excite[0][0]']
2D)
block6h_project_bn (BatchN (None, None, None, 192) 768 ['block6h_project_conv[0][0]']
ormalization)
block6h_drop (Dropout) (None, None, None, 192) 0 ['block6h_project_bn[0][0]']
block6h_add (Add) (None, None, None, 192) 0 ['block6h_drop[0][0]',
'block6g_add[0][0]']
top_conv (Conv2D) (None, None, None, 1280) 245760 ['block6h_add[0][0]']
top_bn (BatchNormalization (None, None, None, 1280) 5120 ['top_conv[0][0]']
)
top_activation (Activation (None, None, None, 1280) 0 ['top_bn[0][0]']
)
==================================================================================================
Total params: 5919312 (22.58 MB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 5919312 (22.58 MB)
__________________________________________________________________________________________________
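The parameter counts in the base-model summary above can be reproduced by hand. The following is a minimal sketch in plain Python, using `block6a` as the example. It assumes the main convolutions carry no bias (they are followed by batch normalization), that batch normalization stores four values per channel, and that the squeeze-and-excite 1x1 convolutions do carry a bias. The 3x3 depthwise kernel is inferred from 6048 / 672 = 9 rather than read from the summary, and the helper names are hypothetical.

```python
def conv2d_params(in_ch, out_ch, kernel=1, use_bias=False):
    """Standard convolution: kernel*kernel*in*out weights, plus optional biases."""
    return kernel * kernel * in_ch * out_ch + (out_ch if use_bias else 0)


def depthwise_conv_params(channels, kernel=3):
    """Depthwise convolution: one kernel*kernel filter per channel (no bias assumed)."""
    return kernel * kernel * channels


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for every channel."""
    return 4 * channels


# Channel sizes are taken from the Output Shape column of the summary.
block6a = {
    "block6a_expand_conv": conv2d_params(112, 672),
    "block6a_expand_bn": batchnorm_params(672),
    "block6a_dwconv2": depthwise_conv_params(672),
    "block6a_se_reduce": conv2d_params(672, 28, use_bias=True),
    "block6a_se_expand": conv2d_params(28, 672, use_bias=True),
    "block6a_project_conv": conv2d_params(672, 192),
    "block6a_project_bn": batchnorm_params(192),
}

for name, count in block6a.items():
    print(f"{name}: {count}")
```

Each printed value should match the Param # column for the layer of the same name. Layers such as Activation, Dropout, Add and Multiply report 0 because they hold no weights.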
m0.summary()
Model: "model"
_________________________________________________________________
Layer (type)                                          Output Shape              Param #
=================================================================
input_layer (InputLayer)                              [(None, 224, 224, 3)]     0
efficientnetv2-b0 (Functional)                        (None, None, None, 1280)  5919312
global_average_pooling_layer (GlobalAveragePooling2D) (None, 1280)              0
output_layer (Dense)                                  (None, 10)                12810
=================================================================
Total params: 5932122 (22.63 MB)
Trainable params: 12810 (50.04 KB)
Non-trainable params: 5919312 (22.58 MB)
_________________________________________________________________
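The trainable and non-trainable totals in this summary follow directly from the layer shapes: the base model is frozen, so only the dense output layer contributes trainable weights. A minimal arithmetic sketch, assuming float32 weights (4 bytes each), which is consistent with the sizes the summary reports; the variable names are illustrative only.

```python
feature_dim = 1280   # channels leaving the pooling layer, per the summary
num_classes = 10     # units in output_layer, per the summary

# Dense layer: one weight per (input, unit) pair plus one bias per unit.
dense_params = feature_dim * num_classes + num_classes

base_params = 5_919_312  # frozen efficientnetv2-b0 parameters, per the summary
total_params = base_params + dense_params

# Assuming float32 storage, each parameter takes 4 bytes.
dense_kb = dense_params * 4 / 1024
base_mb = base_params * 4 / 1024 ** 2

print(f"trainable (dense head): {dense_params} ({dense_kb:.2f} KB)")
print(f"non-trainable (base):   {base_params} ({base_mb:.2f} MB)")
print(f"total:                  {total_params}")
```

The pooling layer adds nothing to the count because averaging has no weights to learn.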
Visualize Loss & Accuracy Curves
plot_loss_curves(m0History)
A Layer-in-action: GlobalAveragePooling2D
Here:
- create a random input tensor
- apply the GlobalAveragePooling2D keras layer
- see the impact of the layer on the input data
# Define input tensor shape (same number of dimensions as the output of efficientnetv2-b0)
input_shape = (1, 4, 4, 3)
# Create a random tensor
tf.random.set_seed(42)
input_tensor = tf.random.normal(input_shape)
print(f"Random input tensor:\n {input_tensor}\n")
# Pass the random tensor through a global average pooling 2D layer
global_average_pooled_tensor = tf.keras.layers.GlobalAveragePooling2D()(input_tensor)
Random input tensor:
 [[[[ 0.3274685 -0.8426258 0.3194337 ]
   [-1.4075519 -2.3880599 -1.0392479 ]
   [-0.5573232 0.539707 1.6994323 ]
   [ 0.28893656 -1.5066116 -0.26454744]]
  [[-0.59722406 -1.9171132 -0.62044144]
   [ 0.8504023 -0.40604794 -3.0258412 ]
   [ 0.9058464 0.29855987 -0.22561555]
   [-0.7616443 -1.891714 -0.9384712 ]]
  [[ 0.77852213 -0.47338897 0.97772694]
   [ 0.24694404 0.20573747 -0.5256233 ]
   [ 0.32410017 0.02545409 -0.10638497]
   [-0.6369475 1.1603122 0.2507359 ]]
  [[-0.41728497 0.40125778 -1.4145442 ]
   [-0.59318566 -1.6617213 0.33567193]
   [ 0.10815629 0.2347968 -0.56668764]
   [-0.35819843 0.88698626 0.5274477 ]]]]
print(f"2D global average pooled random tensor:\n {global_average_pooled_tensor}\n")
2D global average pooled random tensor:
 [[-0.09368646 -0.45840445 -0.28855976]]
# Check the shapes of the different tensors
print(f"Shape of input tensor: {input_tensor.shape}")
print(f"Shape of 2D global averaged pooled input tensor: {global_average_pooled_tensor.shape}")
Shape of input tensor: (1, 4, 4, 3)
Shape of 2D global averaged pooled input tensor: (1, 3)
Reduce Mean to get the same
# This is the same as GlobalAveragePooling2D()
tf.reduce_mean(input_tensor, axis=[1, 2]) # average across the middle axes
<tf.Tensor: shape=(1, 3), dtype=float32, numpy=array([[-0.09368646, -0.45840445, -0.28855976]], dtype=float32)>
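To tie the toy example back to the model summary: the same averaging turns the base model's 4D output into the (None, 1280) feature vector that feeds the output layer. A minimal NumPy sketch of that shape change, used here only as a framework-free stand-in for the Keras layer. The 7x7 spatial grid is an assumption for illustration, since the summary reports the spatial dimensions as None.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the base model's output: (batch, height, width, channels).
# The 7x7 grid is assumed; only the 1280 channels come from the summary.
feature_maps = rng.normal(size=(1, 7, 7, 1280)).astype(np.float32)

# Averaging over the height and width axes mirrors GlobalAveragePooling2D.
feature_vector = feature_maps.mean(axis=(1, 2))

print(f"Shape of feature maps:   {feature_maps.shape}")
print(f"Shape of feature vector: {feature_vector.shape}")
```

Whatever the spatial size, the result keeps one value per channel, so the dense output layer always receives 1280 inputs per image.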
A Layer-in-action: GlobalMaxPool2D
# Define input tensor shape (same number of dimensions as the output of efficientnetv2-b0)
input_shape = (1, 4, 4, 3)
# Create a random tensor
tf.random.set_seed(42)
input_tensor = tf.random.normal(input_shape)
print(f"Random input tensor:\n {input_tensor}\n")
# Pass the random tensor through a global max pooling 2D layer
global_max_pooled_tensor = tf.keras.layers.GlobalMaxPool2D()(input_tensor)
Random input tensor:
 [[[[ 0.3274685 -0.8426258 0.3194337 ]
   [-1.4075519 -2.3880599 -1.0392479 ]
   [-0.5573232 0.539707 1.6994323 ]
   [ 0.28893656 -1.5066116 -0.26454744]]
  [[-0.59722406 -1.9171132 -0.62044144]
   [ 0.8504023 -0.40604794 -3.0258412 ]
   [ 0.9058464 0.29855987 -0.22561555]
   [-0.7616443 -1.891714 -0.9384712 ]]
  [[ 0.77852213 -0.47338897 0.97772694]
   [ 0.24694404 0.20573747 -0.5256233 ]
   [ 0.32410017 0.02545409 -0.10638497]
   [-0.6369475 1.1603122 0.2507359 ]]
  [[-0.41728497 0.40125778 -1.4145442 ]
   [-0.59318566 -1.6617213 0.33567193]
   [ 0.10815629 0.2347968 -0.56668764]
   [-0.35819843 0.88698626 0.5274477 ]]]]
print(f"global_max_pooled_tensor:\n {global_max_pooled_tensor}\n")
global_max_pooled_tensor:
 [[0.9058464 1.1603122 1.6994323]]
# Check the shapes of the different tensors
print(f"Shape of input tensor: {input_tensor.shape}")
print(f"Shape of 2D global averaged pooled input tensor: {global_average_pooled_tensor.shape}")
print(f"Shape of global_max_pooled_tensor: {global_max_pooled_tensor.shape}")
Shape of input tensor: (1, 4, 4, 3)
Shape of 2D global averaged pooled input tensor: (1, 3)
Shape of global_max_pooled_tensor: (1, 3)
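Both pooling layers produce the same output shape; they differ only in how each channel's spatial grid is collapsed. A small hand-checkable sketch in NumPy (a stand-in for the Keras layers, not the layers themselves) makes the difference visible. In TensorFlow, the corresponding reductions would be `tf.reduce_max` and `tf.reduce_mean` over `axis=[1, 2]`.

```python
import numpy as np

# Tiny tensor: (batch=1, height=2, width=2, channels=2).
# Channel 0 holds 1, 3, -2, 2 and channel 1 holds -4, 0, 5, 1.
x = np.array(
    [[[[1.0, -4.0], [3.0, 0.0]],
      [[-2.0, 5.0], [2.0, 1.0]]]],
    dtype=np.float32,
)

global_max = x.max(axis=(1, 2))   # analogue of GlobalMaxPool2D
global_avg = x.mean(axis=(1, 2))  # analogue of GlobalAveragePooling2D

print(f"global max: {global_max}")  # channel maxima: 3 and 5
print(f"global avg: {global_avg}")  # channel means: 1 and 0.5
print(f"shapes: {global_max.shape} {global_avg.shape}")
```

Max pooling keeps only the strongest activation in each channel, while average pooling summarizes the whole grid, which is why the two results above differ even though both have shape (1, 2).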