In [ ]:
%load_ext autoreload
%autoreload 2
Multiclass classification¶
- Predicting one of more than two mutually exclusive categories. Examples:
- A zoomed-in photo of a single food item you want to identify. (A single image may contain more than one food item; whether that makes the problem multiclass or multilabel depends on how you frame it.)
- Genuinely mutually exclusive events such as sunny, overcast, cloudy, or morning, afternoon, evening, night.
- The distribution of output labels (conditional on $X$ and parameterized by weights and biases $W, B$) is a probability distribution over the labels.
- The loss function is categorical crossentropy, which measures how closely the modelled distribution matches the ground-truth label distribution. (Related to the KL divergence, but not identical. Refer: Machine Learning Mastery: A Gentle Introduction to Cross-Entropy for Machine Learning.)
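As a small sketch of the idea above (the logits and one-hot label are made-up illustration values, not from this dataset), categorical crossentropy compares the softmax output distribution against the ground-truth label:

```python
import numpy as np

# Made-up logits for a 3-class problem and a one-hot ground-truth label
logits = np.array([2.0, 1.0, 0.1])
y_true = np.array([1.0, 0.0, 0.0])  # the true class is class 0

# Softmax turns the logits into a probability distribution over the labels
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Categorical crossentropy: -sum over classes of y_true * log(p).
# With a one-hot label this reduces to -log(p[true_class]).
ce = -np.sum(y_true * np.log(probs))
print(probs, ce)
```

The loss is small when the model puts high probability on the true class and grows without bound as that probability approaches zero.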
In [ ]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import os
from src.image import ClassicImageDataDirectory, ImageDataset
Load the data¶
In [ ]:
DATA_DIR = '../data/10_food_classes_all_data/'
TARGET_SIZE = (224, 224)
ClassicImageDataDirectory¶
In [ ]:
imgdir = ClassicImageDataDirectory(DATA_DIR, TARGET_SIZE, dtype=np.uint8)
Class Names¶
In [ ]:
imgdir.class_names
Out[ ]:
('chicken_curry', 'chicken_wings', 'fried_rice', 'grilled_salmon', 'hamburger', 'ice_cream', 'pizza', 'ramen', 'steak', 'sushi')
Hmm, yummy! We have so many food items. (Although I am a vegetarian!)
Counts¶
In [ ]:
imgdir.labelcountdf
Out[ ]:
|   | label | name | count_train | count_test |
|---|---|---|---|---|
| 0 | 0 | chicken_curry | 750 | 250 |
| 1 | 1 | chicken_wings | 750 | 250 |
| 2 | 2 | fried_rice | 750 | 250 |
| 3 | 3 | grilled_salmon | 750 | 250 |
| 4 | 4 | hamburger | 750 | 250 |
| 5 | 5 | ice_cream | 750 | 250 |
| 6 | 6 | pizza | 750 | 250 |
| 7 | 7 | ramen | 750 | 250 |
| 8 | 8 | steak | 750 | 250 |
| 9 | 9 | sushi | 750 | 250 |
We have 750 training images and 250 test images for each class.
In [ ]:
imgdir.plot_labelcounts()
Out[ ]:
<AxesSubplot:title={'center':'Train vs Test label distribution'}, xlabel='name'>
Load a Batch ImageDataset¶
In [ ]:
datagen = imgdir.load(64)
batch = next(datagen)
In [ ]:
batch.view_random_images(class_names='all', n_each=2);
Let's Model it!¶
In [ ]:
import tensorflow as tf
from tensorflow.keras import layers, optimizers, regularizers, losses, callbacks
from src.evaluate import KerasMetrics
from src.visualize import plot_keras_model, plot_confusion_matrix, plot_learning_curve
from sklearn import metrics
In [ ]:
tfmodels = {}
Model 1: TinyVGG (no data augmentation)¶
Set up initial params¶
In [ ]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
SEED = 42
IMAGE_DIM = (224, 224)
NUM_CHANNELS = 3
INPUT_SHAPE = (*IMAGE_DIM, NUM_CHANNELS)
BATCH_SIZE = 32 # Experimenting for faster and better training
VALIDATION_SPLIT = 0.1 # Experimenting for faster training
SCALING_FACTOR = 1/255.
CLASS_MODE = 'categorical'
N_CLASSES = imgdir.n_classes
# Set the seed
tf.random.set_seed(SEED)
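A quick sanity check of what `SCALING_FACTOR = 1/255.` does (the pixel values below are made-up): images load as `uint8` in `[0, 255]`, and multiplying by `1/255.` maps them into `[0.0, 1.0]` floats, the range the network trains on.

```python
import numpy as np

# A tiny made-up "image": three uint8 pixel intensities
pixels = np.array([0, 128, 255], dtype=np.uint8)

# Rescale exactly as ImageDataGenerator(rescale=1/255.) would
scaled = pixels * (1 / 255.0)
print(scaled)  # floats in [0.0, 1.0]
```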
Preprocess the data¶
In [ ]:
train_datagen = ImageDataGenerator(rescale=SCALING_FACTOR, validation_split=VALIDATION_SPLIT)
test_datagen = ImageDataGenerator(rescale=SCALING_FACTOR)
Setup the train and test directories¶
In [ ]:
train_dir = imgdir.train['dir']
test_dir = imgdir.test['dir']
Import the data from the directories and turn it into batches¶
In [ ]:
train_data = train_datagen.flow_from_directory(directory=train_dir, target_size=TARGET_SIZE,
class_mode=CLASS_MODE, batch_size=BATCH_SIZE,
seed=SEED, subset='training')
validation_data = train_datagen.flow_from_directory(directory=train_dir, target_size=TARGET_SIZE,
class_mode=CLASS_MODE, batch_size=BATCH_SIZE,
seed=SEED, subset='validation')
test_data = test_datagen.flow_from_directory(directory=test_dir, target_size=TARGET_SIZE,
class_mode=CLASS_MODE, batch_size=BATCH_SIZE,
shuffle=False)
Found 6750 images belonging to 10 classes.
Found 750 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.
Create the model¶
In [ ]:
# Create the CNN Model
model = tf.keras.models.Sequential([
layers.Input(shape=INPUT_SHAPE),
layers.Conv2D(filters=10, kernel_size=3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(pool_size=2, padding='same'),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2, padding='same'),
layers.Flatten(),
layers.Dense(N_CLASSES, activation='softmax')
], name='TinyVGG')
# Compile the model
model.compile(loss=losses.CategoricalCrossentropy(), optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
Model: "TinyVGG"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_16 (Conv2D)           (None, 222, 222, 10)      280
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 220, 220, 10)      910
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 110, 110, 10)      0
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 108, 108, 10)      910
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 106, 106, 10)      910
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 53, 53, 10)        0
_________________________________________________________________
flatten_4 (Flatten)          (None, 28090)             0
_________________________________________________________________
dense_4 (Dense)              (None, 10)                280910
=================================================================
Total params: 283,920
Trainable params: 283,920
Non-trainable params: 0
_________________________________________________________________
In [ ]:
plot_keras_model(model, show_shapes=True)
Out[ ]:
Fit the model¶
In [ ]:
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data),
epochs=10)
tfmodels[model.name] = model
Epoch 1/10
106/106 [==============================] - 60s 568ms/step - loss: 1.9787 - f1: 0.0655 - accuracy: 0.2953 - val_loss: 1.9737 - val_f1: 0.0752 - val_accuracy: 0.2973
Epoch 2/10
106/106 [==============================] - 57s 540ms/step - loss: 1.7643 - f1: 0.1656 - accuracy: 0.3924 - val_loss: 1.9382 - val_f1: 0.1759 - val_accuracy: 0.3240
Epoch 3/10
106/106 [==============================] - 63s 589ms/step - loss: 1.4475 - f1: 0.3790 - accuracy: 0.5124 - val_loss: 2.0325 - val_f1: 0.1962 - val_accuracy: 0.3293
Epoch 4/10
106/106 [==============================] - 54s 508ms/step - loss: 1.0060 - f1: 0.6230 - accuracy: 0.6738 - val_loss: 2.3596 - val_f1: 0.2399 - val_accuracy: 0.2920
Epoch 5/10
106/106 [==============================] - 48s 451ms/step - loss: 0.5226 - f1: 0.8285 - accuracy: 0.8363 - val_loss: 3.3222 - val_f1: 0.2594 - val_accuracy: 0.2720
Epoch 6/10
106/106 [==============================] - 37s 351ms/step - loss: 0.2057 - f1: 0.9388 - accuracy: 0.9404 - val_loss: 3.8422 - val_f1: 0.2597 - val_accuracy: 0.2680
Epoch 7/10
106/106 [==============================] - 35s 329ms/step - loss: 0.0800 - f1: 0.9802 - accuracy: 0.9816 - val_loss: 4.9968 - val_f1: 0.2715 - val_accuracy: 0.2693
Epoch 8/10
106/106 [==============================] - 34s 319ms/step - loss: 0.0420 - f1: 0.9893 - accuracy: 0.9898 - val_loss: 5.6086 - val_f1: 0.2761 - val_accuracy: 0.2840
Epoch 9/10
106/106 [==============================] - 36s 340ms/step - loss: 0.0124 - f1: 0.9988 - accuracy: 0.9988 - val_loss: 6.2481 - val_f1: 0.2806 - val_accuracy: 0.2840
Epoch 10/10
106/106 [==============================] - 35s 333ms/step - loss: 0.0045 - f1: 0.9999 - accuracy: 0.9999 - val_loss: 6.5114 - val_f1: 0.2805 - val_accuracy: 0.2840
Learning Curve¶
In [ ]:
plot_learning_curve(model, extra_metric='f1');
- The learning rate seems fine (at least on the training set, where the loss decreases steadily towards zero).
- However, the model appears to be too complex and is heavily overfitting! We should try:
- Data augmentation (definitely)
- Regularization
- A simpler model (or a different architecture)
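Of the options above, the regularization idea can be sketched without Keras at all (the weights and loss value below are made-up illustration numbers): L2 regularization adds a penalty proportional to the squared weights, discouraging the large weights an overfit model tends to grow.

```python
import numpy as np

# Made-up weights and a made-up unregularized (data) loss
weights = np.array([0.5, -1.2, 3.0])
data_loss = 1.8
l2_lambda = 0.01  # regularization strength

# L2 penalty: lambda * sum of squared weights, added to the loss
l2_penalty = l2_lambda * np.sum(weights ** 2)
total_loss = data_loss + l2_penalty
print(l2_penalty, total_loss)
```

In Keras this is typically a one-liner per layer, e.g. `kernel_regularizer=regularizers.l2(0.01)`.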
Prediction evaluation¶
In [ ]:
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(1)
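To read these argmax predictions back as food names, invert the `class_indices` mapping that `flow_from_directory` builds (class name to integer label). The three-class dictionary below is a made-up stand-in for `test_data.class_indices`:

```python
# Made-up stand-in for test_data.class_indices (class name -> integer label)
class_indices = {'pizza': 0, 'ramen': 1, 'sushi': 2}

# Invert the mapping: integer label -> class name
index_to_name = {label: name for name, label in class_indices.items()}

# e.g. predictions produced by y_test_pred_probs.argmax(1)
preds = [2, 0, 1]
pred_names = [index_to_name[i] for i in preds]
print(pred_names)  # ['sushi', 'pizza', 'ramen']
```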
Classification report¶
In [ ]:
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
                precision    recall  f1-score   support

 chicken_curry       0.30      0.28      0.29       250
 chicken_wings       0.29      0.28      0.29       250
    fried_rice       0.34      0.40      0.37       250
grilled_salmon       0.24      0.26      0.25       250
     hamburger       0.20      0.21      0.21       250
     ice_cream       0.36      0.40      0.38       250
         pizza       0.32      0.22      0.26       250
         ramen       0.38      0.36      0.37       250
         steak       0.34      0.37      0.36       250
         sushi       0.24      0.22      0.23       250

      accuracy                           0.30      2500
     macro avg       0.30      0.30      0.30      2500
  weighted avg       0.30      0.30      0.30      2500
Confusion Matrix¶
In [ ]:
plot_confusion_matrix(test_data.labels, y_test_preds, classes=imgdir.class_names, figsize=(14, 14), text_size=10)
Out[ ]:
<matplotlib.image.AxesImage at 0x1f8a8cb4188>
Model 2: TinyVGG (with data augmentation)¶
Set initial params¶
In [ ]:
SEED = 42
IMAGE_DIM = (224, 224)
NUM_CHANNELS = 3
INPUT_SHAPE = (*IMAGE_DIM, NUM_CHANNELS)
CLASS_MODE = 'categorical'
N_CLASSES = imgdir.n_classes
BATCH_SIZE = 32
VALIDATION_SPLIT = 0.1
# Set seed
tf.random.set_seed(42)
Preprocess the data (Data Augmentation)¶
In [ ]:
train_datagen = ImageDataGenerator(
rescale=1/255.,
    rotation_range=0.2,  # NOTE: rotation_range is in degrees, so 0.2 is an almost-zero rotation (values like 20 are more typical)
shear_range=0.2,
zoom_range=0.2,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
validation_split=VALIDATION_SPLIT)
test_datagen = ImageDataGenerator(rescale=1/255.)
Setup the train and test directories¶
In [ ]:
train_dir = imgdir.train['dir']
test_dir = imgdir.test['dir']
Import the data from the directories¶
In [ ]:
train_data = train_datagen.flow_from_directory(directory=train_dir, batch_size=BATCH_SIZE,
target_size=IMAGE_DIM, class_mode=CLASS_MODE, seed=SEED, subset='training')
validation_data = train_datagen.flow_from_directory(directory=train_dir, batch_size=BATCH_SIZE,
target_size=IMAGE_DIM,class_mode=CLASS_MODE, seed=SEED, subset='validation')
test_data = test_datagen.flow_from_directory(directory=test_dir, batch_size=BATCH_SIZE, target_size=IMAGE_DIM, class_mode=CLASS_MODE, shuffle=False)
Found 6750 images belonging to 10 classes.
Found 750 images belonging to 10 classes.
Found 2500 images belonging to 10 classes.
Create the model¶
In [ ]:
# Create
model = tf.keras.models.Sequential([
layers.Input(shape=INPUT_SHAPE),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2),
layers.Flatten(),
layers.Dense(N_CLASSES, activation='softmax')
], name='TinyVGG-data-augment')
# Compile
model.compile(loss=losses.categorical_crossentropy, optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
Model: "TinyVGG-data-augment"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_66 (Conv2D)           (None, 222, 222, 10)      280
_________________________________________________________________
conv2d_67 (Conv2D)           (None, 220, 220, 10)      910
_________________________________________________________________
max_pooling2d_33 (MaxPooling (None, 110, 110, 10)      0
_________________________________________________________________
conv2d_68 (Conv2D)           (None, 108, 108, 10)      910
_________________________________________________________________
conv2d_69 (Conv2D)           (None, 106, 106, 10)      910
_________________________________________________________________
max_pooling2d_34 (MaxPooling (None, 53, 53, 10)        0
_________________________________________________________________
flatten_14 (Flatten)         (None, 28090)             0
_________________________________________________________________
dense_17 (Dense)             (None, 10)                280910
=================================================================
Total params: 283,920
Trainable params: 283,920
Non-trainable params: 0
_________________________________________________________________
Fit the model¶
In [ ]:
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data),
epochs=10)
Epoch 1/10
211/211 [==============================] - 99s 464ms/step - loss: 2.2490 - f1: 0.0034 - accuracy: 0.1513 - val_loss: 2.1087 - val_f1: 0.0370 - val_accuracy: 0.2387
Epoch 2/10
211/211 [==============================] - 101s 478ms/step - loss: 2.0742 - f1: 0.0256 - accuracy: 0.2657 - val_loss: 2.0828 - val_f1: 0.0579 - val_accuracy: 0.2613
Epoch 3/10
211/211 [==============================] - 105s 497ms/step - loss: 2.0405 - f1: 0.0513 - accuracy: 0.2753 - val_loss: 2.0306 - val_f1: 0.0856 - val_accuracy: 0.3013
Epoch 4/10
211/211 [==============================] - 102s 485ms/step - loss: 1.9816 - f1: 0.0619 - accuracy: 0.3157 - val_loss: 1.9420 - val_f1: 0.1401 - val_accuracy: 0.3200
Epoch 5/10
211/211 [==============================] - 108s 512ms/step - loss: 1.9348 - f1: 0.0991 - accuracy: 0.3241 - val_loss: 1.9560 - val_f1: 0.1244 - val_accuracy: 0.3253
Epoch 6/10
211/211 [==============================] - 131s 621ms/step - loss: 1.9034 - f1: 0.1140 - accuracy: 0.3370 - val_loss: 1.8775 - val_f1: 0.1087 - val_accuracy: 0.3507
Epoch 7/10
211/211 [==============================] - 124s 590ms/step - loss: 1.8768 - f1: 0.1364 - accuracy: 0.3576 - val_loss: 1.8609 - val_f1: 0.1565 - val_accuracy: 0.3627
Epoch 8/10
211/211 [==============================] - 129s 609ms/step - loss: 1.8404 - f1: 0.1679 - accuracy: 0.3711 - val_loss: 1.8362 - val_f1: 0.1728 - val_accuracy: 0.3693
Epoch 9/10
211/211 [==============================] - 119s 565ms/step - loss: 1.8466 - f1: 0.1551 - accuracy: 0.3651 - val_loss: 1.8442 - val_f1: 0.1778 - val_accuracy: 0.3640
Epoch 10/10
211/211 [==============================] - 128s 604ms/step - loss: 1.8421 - f1: 0.1674 - accuracy: 0.3683 - val_loss: 1.7901 - val_f1: 0.1480 - val_accuracy: 0.4147
In [ ]:
tfmodels[model.name] = model
Learning Curve¶
In [ ]:
plot_learning_curve(model, extra_metric='f1')
Out[ ]:
(<Figure size 864x216 with 2 Axes>, array([<AxesSubplot:title={'center':'loss'}>, <AxesSubplot:title={'center':'f1'}>], dtype=object))
The learning rate seems too low for the model to learn effectively. We need to increase it (or find a better learning rate). On the plus side, with data augmentation there is hope that we won't overfit as badly as the model trained without it.
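Why a larger learning rate speeds things up can be seen on a toy problem (this is a made-up quadratic loss, not our model): gradient descent with a larger step size reaches the minimum in far fewer steps, as long as it stays below the stability limit.

```python
def steps_to_converge(lr, w0=1.0, tol=1e-3, max_steps=10_000):
    """Minimize loss(w) = w**2 by gradient descent; return steps needed."""
    w = w0
    for step in range(1, max_steps + 1):
        grad = 2 * w  # d(w**2)/dw
        w -= lr * grad
        if abs(w) < tol:
            return step
    return max_steps

slow = steps_to_converge(0.01)
fast = steps_to_converge(0.1)
print(slow, fast)  # the larger learning rate converges in far fewer steps
```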
Prediction Evaluation¶
In [ ]:
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(axis=1)
Classification report¶
In [ ]:
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
                precision    recall  f1-score   support

 chicken_curry       0.57      0.20      0.30       250
 chicken_wings       0.43      0.57      0.49       250
    fried_rice       0.67      0.44      0.53       250
grilled_salmon       0.43      0.31      0.36       250
     hamburger       0.34      0.34      0.34       250
     ice_cream       0.59      0.34      0.43       250
         pizza       0.53      0.51      0.52       250
         ramen       0.37      0.51      0.43       250
         steak       0.34      0.82      0.48       250
         sushi       0.53      0.32      0.40       250

      accuracy                           0.44      2500
     macro avg       0.48      0.44      0.43      2500
  weighted avg       0.48      0.44      0.43      2500
Find a better learning rate¶
LearningRateScheduler¶
In [ ]:
NUM_EPOCHS = 10
lr_epochs = np.logspace(-3, -1, NUM_EPOCHS)
lr_scheduler = callbacks.LearningRateScheduler(lambda epoch: lr_epochs[epoch])
Clone TinyVGG model¶
In [ ]:
# Clone
model = tf.keras.models.clone_model(tfmodels['TinyVGG'])
model._name = 'TinyVGG-data-augment-bestlr'
# Recompile
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
Model: "TinyVGG-data-augment-bestlr"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_16 (Conv2D)           (None, 222, 222, 10)      280
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 220, 220, 10)      910
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 110, 110, 10)      0
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 108, 108, 10)      910
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 106, 106, 10)      910
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 53, 53, 10)        0
_________________________________________________________________
flatten_4 (Flatten)          (None, 28090)             0
_________________________________________________________________
dense_4 (Dense)              (None, 10)                280910
=================================================================
Total params: 283,920
Trainable params: 283,920
Non-trainable params: 0
_________________________________________________________________
Fit the model¶
In [ ]:
DATA_PERCENT = 50
train_steps = int(len(train_data)*DATA_PERCENT/100)
In [ ]:
history = model.fit(train_data, steps_per_epoch=train_steps, epochs=NUM_EPOCHS, callbacks=[lr_scheduler])
Epoch 1/10
105/105 [==============================] - 115s 1s/step - loss: 2.2041 - f1: 5.7720e-04 - accuracy: 0.1778
Epoch 2/10
105/105 [==============================] - 99s 942ms/step - loss: 2.1470 - f1: 0.0089 - accuracy: 0.2277
Epoch 3/10
105/105 [==============================] - 91s 869ms/step - loss: 2.1725 - f1: 0.0132 - accuracy: 0.2129
Epoch 4/10
105/105 [==============================] - 89s 845ms/step - loss: 2.1748 - f1: 0.0134 - accuracy: 0.2070
Epoch 5/10
105/105 [==============================] - 88s 836ms/step - loss: 2.2220 - f1: 0.0091 - accuracy: 0.1825
Epoch 6/10
105/105 [==============================] - 87s 826ms/step - loss: 2.2487 - f1: 0.0057 - accuracy: 0.1533
Epoch 7/10
105/105 [==============================] - 85s 811ms/step - loss: 2.2662 - f1: 5.7720e-04 - accuracy: 0.1474
Epoch 8/10
105/105 [==============================] - 86s 818ms/step - loss: 2.3113 - f1: 0.0010 - accuracy: 0.0992
Epoch 9/10
105/105 [==============================] - 86s 815ms/step - loss: 2.3095 - f1: 0.0000e+00 - accuracy: 0.0961
Epoch 10/10
105/105 [==============================] - 83s 791ms/step - loss: 2.3149 - f1: 0.0000e+00 - accuracy: 0.0980
Learning Rate vs Loss¶
In [ ]:
history_df = pd.DataFrame(history.history)
history_df.head()
Out[ ]:
|   | loss | f1 | accuracy | lr |
|---|---|---|---|---|
| 0 | 2.204108 | 0.000577 | 0.177784 | 0.001000 |
| 1 | 2.147049 | 0.008896 | 0.227679 | 0.001668 |
| 2 | 2.172542 | 0.013227 | 0.212924 | 0.002783 |
| 3 | 2.174774 | 0.013412 | 0.206968 | 0.004642 |
| 4 | 2.222023 | 0.009086 | 0.182549 | 0.007743 |
In [ ]:
plt.semilogx(history_df['lr'], history_df['loss'])
plt.xlabel('lr')
plt.ylabel('loss')
plt.title('Learning Rate vs Loss', fontdict=dict(weight='bold', size=20))
Out[ ]:
Text(0.5, 1.0, 'Learning Rate vs Loss')
I can't find an obvious reason why this happens! The best learning rate seems to be the default after all. Maybe we should try making the model more complex.
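For reference, a common heuristic for reading an lr-vs-loss sweep (the numbers below are made-up, not from this run): take the learning rate at which the loss was lowest and back off by roughly 10x, since the loss typically starts diverging shortly after that point.

```python
import pandas as pd

# Made-up sweep results standing in for history_df[['lr', 'loss']]
history_df = pd.DataFrame({
    'lr':   [1e-3, 2e-3, 5e-3, 1e-2, 2e-2],
    'loss': [2.20, 2.15, 2.17, 2.25, 2.31],
})

# Learning rate at the minimum loss, backed off by one order of magnitude
lr_at_min_loss = history_df.loc[history_df['loss'].idxmin(), 'lr']
suggested_lr = lr_at_min_loss / 10
print(suggested_lr)
```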
In [ ]:
best_lr = 1e-3
Model 3: TinyVGG - Extra Conv - Dense¶
Create the model¶
In [ ]:
model.summary()
Model: "TinyVGG-data-augment-bestlr"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_16 (Conv2D)           (None, 222, 222, 10)      280
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 220, 220, 10)      910
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 110, 110, 10)      0
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 108, 108, 10)      910
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 106, 106, 10)      910
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 53, 53, 10)        0
_________________________________________________________________
flatten_4 (Flatten)          (None, 28090)             0
_________________________________________________________________
dense_4 (Dense)              (None, 10)                280910
=================================================================
Total params: 283,920
Trainable params: 283,920
Non-trainable params: 0
_________________________________________________________________
In [ ]:
model = tf.keras.models.Sequential([
layers.Input(shape=INPUT_SHAPE),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.MaxPool2D(2),
layers.Flatten(),
layers.Dense(512, activation='relu'),
layers.Dense(N_CLASSES, activation='softmax')
], name='TinyVGG-Extra-Conv-Dense')
# Compile
model.compile(loss=losses.categorical_crossentropy, optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
Model: "TinyVGG-Extra-Conv-Dense"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_76 (Conv2D)           (None, 222, 222, 10)      280
_________________________________________________________________
conv2d_77 (Conv2D)           (None, 220, 220, 10)      910
_________________________________________________________________
max_pooling2d_38 (MaxPooling (None, 110, 110, 10)      0
_________________________________________________________________
conv2d_78 (Conv2D)           (None, 108, 108, 10)      910
_________________________________________________________________
conv2d_79 (Conv2D)           (None, 106, 106, 10)      910
_________________________________________________________________
max_pooling2d_39 (MaxPooling (None, 53, 53, 10)        0
_________________________________________________________________
conv2d_80 (Conv2D)           (None, 51, 51, 10)        910
_________________________________________________________________
conv2d_81 (Conv2D)           (None, 49, 49, 10)        910
_________________________________________________________________
max_pooling2d_40 (MaxPooling (None, 24, 24, 10)        0
_________________________________________________________________
flatten_16 (Flatten)         (None, 5760)              0
_________________________________________________________________
dense_20 (Dense)             (None, 512)               2949632
_________________________________________________________________
dense_21 (Dense)             (None, 10)                5130
=================================================================
Total params: 2,959,592
Trainable params: 2,959,592
Non-trainable params: 0
_________________________________________________________________
Refit!¶
In [ ]:
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data), epochs=10)
tfmodels[model.name] = model
Epoch 1/10
211/211 [==============================] - 182s 857ms/step - loss: 2.2921 - f1: 0.0010 - accuracy: 0.1160 - val_loss: 2.1761 - val_f1: 0.0000e+00 - val_accuracy: 0.1920
Epoch 2/10
211/211 [==============================] - 134s 637ms/step - loss: 2.1625 - f1: 0.0013 - accuracy: 0.1867 - val_loss: 2.0846 - val_f1: 0.0310 - val_accuracy: 0.2467
Epoch 3/10
211/211 [==============================] - 158s 748ms/step - loss: 2.0933 - f1: 0.0184 - accuracy: 0.2346 - val_loss: 2.0233 - val_f1: 0.0317 - val_accuracy: 0.2667
Epoch 4/10
211/211 [==============================] - 149s 704ms/step - loss: 2.0294 - f1: 0.0418 - accuracy: 0.2716 - val_loss: 1.9927 - val_f1: 0.0690 - val_accuracy: 0.3000
Epoch 5/10
211/211 [==============================] - 150s 711ms/step - loss: 1.9793 - f1: 0.0515 - accuracy: 0.2893 - val_loss: 1.9570 - val_f1: 0.0556 - val_accuracy: 0.3147
Epoch 6/10
211/211 [==============================] - 155s 735ms/step - loss: 1.9650 - f1: 0.0710 - accuracy: 0.3044 - val_loss: 1.9370 - val_f1: 0.1037 - val_accuracy: 0.3187
Epoch 7/10
211/211 [==============================] - 157s 743ms/step - loss: 1.9412 - f1: 0.0900 - accuracy: 0.3162 - val_loss: 1.8704 - val_f1: 0.1486 - val_accuracy: 0.3640
Epoch 8/10
211/211 [==============================] - 151s 715ms/step - loss: 1.8863 - f1: 0.1329 - accuracy: 0.3422 - val_loss: 1.8926 - val_f1: 0.1169 - val_accuracy: 0.3427
Epoch 9/10
211/211 [==============================] - 146s 693ms/step - loss: 1.8476 - f1: 0.1552 - accuracy: 0.3590 - val_loss: 1.8562 - val_f1: 0.1414 - val_accuracy: 0.3533
Epoch 10/10
211/211 [==============================] - 136s 646ms/step - loss: 1.8016 - f1: 0.1806 - accuracy: 0.3759 - val_loss: 1.8944 - val_f1: 0.1482 - val_accuracy: 0.3213
Learning Curve¶
In [ ]:
plot_learning_curve(model, extra_metric='f1')
Out[ ]:
(<Figure size 864x216 with 2 Axes>, array([<AxesSubplot:title={'center':'loss'}>, <AxesSubplot:title={'center':'f1'}>], dtype=object))
This is still very slow to train! Should we try BatchNormalization after every convolution layer?
Prediction Evaluation¶
In [ ]:
Copied!
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(axis=1)
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(axis=1)
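The softmax head outputs one probability per class for each image, so `argmax(axis=1)` picks the index of the most likely class per row. A minimal NumPy illustration with made-up probabilities for three hypothetical classes:

```python
import numpy as np

# Each row is one image's predicted distribution over 3 hypothetical classes.
pred_probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
])

# argmax along axis=1 returns the highest-probability class index per row.
preds = pred_probs.argmax(axis=1)
print(preds)  # → [1 0]
```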
Classification report¶
In [ ]:
Copied!
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
                precision    recall  f1-score   support

 chicken_curry       0.43      0.31      0.36       250
 chicken_wings       0.37      0.78      0.50       250
    fried_rice       0.68      0.35      0.46       250
grilled_salmon       0.33      0.36      0.35       250
     hamburger       0.27      0.17      0.21       250
     ice_cream       0.47      0.44      0.46       250
         pizza       0.72      0.25      0.37       250
         ramen       0.32      0.54      0.40       250
         steak       0.43      0.53      0.47       250
         sushi       0.39      0.27      0.32       250

      accuracy                           0.40      2500
     macro avg       0.44      0.40      0.39      2500
  weighted avg       0.44      0.40      0.39      2500
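Note that the `macro avg` row is just the unweighted mean of the per-class scores, and since every class here has the same support (250), it coincides with `weighted avg`. We can verify this by hand with the per-class F1 values from the report:

```python
# Per-class F1 scores copied from the classification report above.
f1_scores = [0.36, 0.50, 0.46, 0.35, 0.21, 0.46, 0.37, 0.40, 0.47, 0.32]

# Macro average: unweighted mean over classes.
macro_f1 = sum(f1_scores) / len(f1_scores)
print(round(macro_f1, 2))  # → 0.39, matching the report's macro avg row
```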
Model 4: TinyVGG - Extra Conv - Dense - BatchNorm¶
Create the model¶
In [ ]:
Copied!
model = tf.keras.models.Sequential([
layers.Input(shape=INPUT_SHAPE),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Flatten(),
layers.Dense(128, activation='relu'),
layers.Dense(N_CLASSES, activation='softmax')
], name='TinyVGG-Extra-Conv-BatchNorm-Dense')
# Compile
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(learning_rate=0.001), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
model = tf.keras.models.Sequential([
layers.Input(shape=INPUT_SHAPE),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Conv2D(10, 3, activation='relu'),
layers.Conv2D(10, 3, activation='relu'),
layers.BatchNormalization(),
layers.MaxPool2D(),
layers.Flatten(),
layers.Dense(128, activation='relu'),
layers.Dense(N_CLASSES, activation='softmax')
], name='TinyVGG-Extra-Conv-BatchNorm-Dense')
# Compile
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(learning_rate=0.001), metrics=[KerasMetrics.f1, 'accuracy'])
# Summary
model.summary()
Model: "TinyVGG-Extra-Conv-BatchNorm-Dense"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_94 (Conv2D)           (None, 222, 222, 10)      280       
_________________________________________________________________
conv2d_95 (Conv2D)           (None, 220, 220, 10)      910       
_________________________________________________________________
batch_normalization_6 (Batch (None, 220, 220, 10)      40        
_________________________________________________________________
max_pooling2d_47 (MaxPooling (None, 110, 110, 10)      0         
_________________________________________________________________
conv2d_96 (Conv2D)           (None, 108, 108, 10)      910       
_________________________________________________________________
conv2d_97 (Conv2D)           (None, 106, 106, 10)      910       
_________________________________________________________________
batch_normalization_7 (Batch (None, 106, 106, 10)      40        
_________________________________________________________________
max_pooling2d_48 (MaxPooling (None, 53, 53, 10)        0         
_________________________________________________________________
conv2d_98 (Conv2D)           (None, 51, 51, 10)        910       
_________________________________________________________________
conv2d_99 (Conv2D)           (None, 49, 49, 10)        910       
_________________________________________________________________
batch_normalization_8 (Batch (None, 49, 49, 10)        40        
_________________________________________________________________
max_pooling2d_49 (MaxPooling (None, 24, 24, 10)        0         
_________________________________________________________________
flatten_19 (Flatten)         (None, 5760)              0         
_________________________________________________________________
dense_26 (Dense)             (None, 128)               737408    
_________________________________________________________________
dense_27 (Dense)             (None, 10)                1290      
=================================================================
Total params: 743,648
Trainable params: 743,588
Non-trainable params: 60
_________________________________________________________________
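The parameter counts in the summary can be verified by hand: a `Conv2D` layer has one `(kh × kw × channels_in)` kernel plus a bias per filter, and a `Dense` layer has one weight per input unit plus a bias per output unit. A quick sanity check of the summary's numbers:

```python
def conv2d_params(kh, kw, c_in, filters):
    # One (kh x kw x c_in) kernel plus one bias per output filter.
    return (kh * kw * c_in + 1) * filters

def dense_params(units_in, units_out):
    # Full weight matrix plus one bias per output unit.
    return (units_in + 1) * units_out

print(conv2d_params(3, 3, 3, 10))       # first conv on RGB input → 280
print(conv2d_params(3, 3, 10, 10))      # subsequent convs → 910
print(dense_params(24 * 24 * 10, 128))  # dense on flattened 24x24x10 → 737408
print(dense_params(128, 10))            # softmax head → 1290
```

BatchNormalization contributes 4 parameters per channel (gamma and beta are trainable, the moving mean and variance are not), which is where the 40-parameter layers and the 60 non-trainable parameters come from.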
Fit the model¶
In [ ]:
Copied!
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data), epochs=10)
tfmodels[model.name] = model
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data), epochs=10)
tfmodels[model.name] = model
Epoch 1/10
211/211 [==============================] - 135s 638ms/step - loss: 2.7445 - f1: 0.0237 - accuracy: 0.1452 - val_loss: 9.4667 - val_f1: 0.0955 - val_accuracy: 0.0960
Epoch 2/10
211/211 [==============================] - 176s 833ms/step - loss: 2.1898 - f1: 0.0387 - accuracy: 0.2086 - val_loss: 3.4010 - val_f1: 0.1493 - val_accuracy: 0.1747
Epoch 3/10
211/211 [==============================] - 181s 856ms/step - loss: 2.0838 - f1: 0.0734 - accuracy: 0.2494 - val_loss: 2.0720 - val_f1: 0.0613 - val_accuracy: 0.2533
Epoch 4/10
211/211 [==============================] - 133s 629ms/step - loss: 2.0246 - f1: 0.0890 - accuracy: 0.2737 - val_loss: 2.0299 - val_f1: 0.1167 - val_accuracy: 0.2667
Epoch 5/10
211/211 [==============================] - 133s 628ms/step - loss: 1.9989 - f1: 0.1017 - accuracy: 0.2988 - val_loss: 2.0132 - val_f1: 0.1354 - val_accuracy: 0.2853
Epoch 6/10
211/211 [==============================] - 132s 625ms/step - loss: 1.9416 - f1: 0.1227 - accuracy: 0.3198 - val_loss: 2.1736 - val_f1: 0.1668 - val_accuracy: 0.2693
Epoch 7/10
211/211 [==============================] - 134s 637ms/step - loss: 1.9183 - f1: 0.1556 - accuracy: 0.3258 - val_loss: 1.9899 - val_f1: 0.1700 - val_accuracy: 0.2960
Epoch 8/10
211/211 [==============================] - 137s 651ms/step - loss: 1.8719 - f1: 0.1730 - accuracy: 0.3556 - val_loss: 1.9032 - val_f1: 0.1996 - val_accuracy: 0.3427
Epoch 9/10
211/211 [==============================] - 137s 649ms/step - loss: 1.8352 - f1: 0.2132 - accuracy: 0.3668 - val_loss: 2.2180 - val_f1: 0.1972 - val_accuracy: 0.2733
Epoch 10/10
211/211 [==============================] - 136s 644ms/step - loss: 1.8304 - f1: 0.2167 - accuracy: 0.3632 - val_loss: 2.0264 - val_f1: 0.1728 - val_accuracy: 0.3000
Learning Curve¶
In [ ]:
Copied!
plot_learning_curve(history, extra_metric='f1');
plot_learning_curve(history, extra_metric='f1');
Looks like we need to reduce our learning rate when the validation loss plateaus.
Prediction Evaluation¶
In [ ]:
Copied!
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(axis=1)
y_test_pred_probs = model.predict(test_data)
y_test_preds = y_test_pred_probs.argmax(axis=1)
Classification report¶
In [ ]:
Copied!
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
print(metrics.classification_report(test_data.labels, y_test_preds, target_names=test_data.class_indices))
                precision    recall  f1-score   support

 chicken_curry       0.47      0.43      0.45       250
 chicken_wings       0.47      0.28      0.35       250
    fried_rice       0.82      0.16      0.27       250
grilled_salmon       0.33      0.12      0.17       250
     hamburger       0.40      0.17      0.24       250
     ice_cream       0.41      0.52      0.46       250
         pizza       0.32      0.66      0.43       250
         ramen       0.48      0.46      0.47       250
         steak       0.34      0.78      0.48       250
         sushi       0.27      0.24      0.25       250

      accuracy                           0.38      2500
     macro avg       0.43      0.38      0.36      2500
  weighted avg       0.43      0.38      0.36      2500
Model 5: TinyVGG - Extra Conv - Dense - BatchNorm - ReduceLROnPlateau¶
ReduceLROnPlateau
¶
In [ ]:
Copied!
reduce_lr = callbacks.ReduceLROnPlateau(patience=2)
reduce_lr = callbacks.ReduceLROnPlateau(patience=2)
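With `patience=2` and the Keras default `factor=0.1`, the callback multiplies the learning rate by 0.1 whenever the monitored metric (`val_loss` by default) fails to improve for 2 consecutive epochs. The core logic can be sketched in plain Python (a simplification: the real callback also supports `min_delta`, `cooldown`, and `min_lr`):

```python
def reduce_lr_on_plateau(val_losses, lr=0.001, factor=0.1, patience=2):
    """Return the per-epoch learning rates implied by a val-loss history."""
    best, wait, lrs = float('inf'), 0, []
    for loss in val_losses:
        lrs.append(lr)
        if loss < best:            # improvement: reset the patience counter
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:   # plateaued for `patience` epochs: shrink lr
                lr *= factor
                wait = 0
    return lrs

# val_loss improves, then stalls for two epochs, triggering a reduction.
lrs = reduce_lr_on_plateau([2.0, 1.9, 1.95, 1.93, 1.8])
print(lrs)
```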
Clone and recompile¶
In [ ]:
Copied!
model = tf.keras.models.clone_model(model)
model._name = 'TinyVGG-Extra-Conv-BatchNorm-Dense-ReduceLROnPlateau'
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
model = tf.keras.models.clone_model(model)
model._name = 'TinyVGG-Extra-Conv-BatchNorm-Dense-ReduceLROnPlateau'
model.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(), metrics=[KerasMetrics.f1, 'accuracy'])
Fit the model¶
In [ ]:
Copied!
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data), epochs=20, callbacks=[reduce_lr])
tfmodels[model.name] = model
history = model.fit(train_data, steps_per_epoch=len(train_data),
validation_data=validation_data, validation_steps=len(validation_data), epochs=20, callbacks=[reduce_lr])
tfmodels[model.name] = model
Epoch 1/20
211/211 [==============================] - 180s 855ms/step - loss: 2.2594 - f1: 0.0237 - accuracy: 0.1686 - val_loss: 8.7019 - val_f1: 0.0994 - val_accuracy: 0.1000
Epoch 2/20
211/211 [==============================] - 188s 889ms/step - loss: 2.1589 - f1: 0.0312 - accuracy: 0.2095 - val_loss: 2.1958 - val_f1: 0.0479 - val_accuracy: 0.1840
Epoch 3/20
211/211 [==============================] - 254s 1s/step - loss: 2.0734 - f1: 0.0593 - accuracy: 0.2567 - val_loss: 2.0766 - val_f1: 0.0550 - val_accuracy: 0.2507
Epoch 4/20
211/211 [==============================] - 298s 1s/step - loss: 2.0093 - f1: 0.0964 - accuracy: 0.2904 - val_loss: 2.1358 - val_f1: 0.1456 - val_accuracy: 0.2360
Epoch 5/20
211/211 [==============================] - 238s 1s/step - loss: 1.9677 - f1: 0.1076 - accuracy: 0.3036 - val_loss: 2.1594 - val_f1: 0.1439 - val_accuracy: 0.2800
Epoch 6/20
211/211 [==============================] - 230s 1s/step - loss: 1.9224 - f1: 0.1303 - accuracy: 0.3255 - val_loss: 1.9277 - val_f1: 0.1365 - val_accuracy: 0.3200
Epoch 7/20
211/211 [==============================] - 246s 1s/step - loss: 1.8953 - f1: 0.1372 - accuracy: 0.3292 - val_loss: 1.9081 - val_f1: 0.1558 - val_accuracy: 0.3320
Epoch 8/20
211/211 [==============================] - 272s 1s/step - loss: 1.8709 - f1: 0.1575 - accuracy: 0.3385 - val_loss: 1.9205 - val_f1: 0.1524 - val_accuracy: 0.3507
Epoch 9/20
211/211 [==============================] - 281s 1s/step - loss: 1.8592 - f1: 0.1676 - accuracy: 0.3547 - val_loss: 1.9341 - val_f1: 0.1784 - val_accuracy: 0.3067
Epoch 10/20
211/211 [==============================] - 223s 1s/step - loss: 1.8534 - f1: 0.1721 - accuracy: 0.3514 - val_loss: 1.8895 - val_f1: 0.1795 - val_accuracy: 0.3400
Epoch 11/20
211/211 [==============================] - 202s 957ms/step - loss: 1.8521 - f1: 0.1710 - accuracy: 0.3526 - val_loss: 1.9190 - val_f1: 0.1746 - val_accuracy: 0.3253
Epoch 12/20
211/211 [==============================] - 153s 724ms/step - loss: 1.8422 - f1: 0.1742 - accuracy: 0.3532 - val_loss: 1.9383 - val_f1: 0.1740 - val_accuracy: 0.3213
Epoch 13/20
211/211 [==============================] - 214s 1s/step - loss: 1.8379 - f1: 0.1747 - accuracy: 0.3520 - val_loss: 1.8922 - val_f1: 0.1705 - val_accuracy: 0.3187
Epoch 14/20
211/211 [==============================] - 236s 1s/step - loss: 1.8461 - f1: 0.1736 - accuracy: 0.3539 - val_loss: 1.8975 - val_f1: 0.1645 - val_accuracy: 0.3227
Epoch 15/20
211/211 [==============================] - 235s 1s/step - loss: 1.8448 - f1: 0.1752 - accuracy: 0.3530 - val_loss: 1.8839 - val_f1: 0.1828 - val_accuracy: 0.3440
Epoch 16/20
211/211 [==============================] - 236s 1s/step - loss: 1.8404 - f1: 0.1702 - accuracy: 0.3486 - val_loss: 1.8914 - val_f1: 0.1728 - val_accuracy: 0.3347
Epoch 17/20
211/211 [==============================] - 189s 894ms/step - loss: 1.8491 - f1: 0.1754 - accuracy: 0.3479 - val_loss: 1.9084 - val_f1: 0.1576 - val_accuracy: 0.3253
Epoch 18/20
211/211 [==============================] - 239s 1s/step - loss: 1.8456 - f1: 0.1740 - accuracy: 0.3461 - val_loss: 1.8775 - val_f1: 0.1857 - val_accuracy: 0.3413
Epoch 19/20
211/211 [==============================] - 222s 1s/step - loss: 1.8490 - f1: 0.1656 - accuracy: 0.3538 - val_loss: 1.9158 - val_f1: 0.1757 - val_accuracy: 0.3333
Epoch 20/20
211/211 [==============================] - 184s 869ms/step - loss: 1.8462 - f1: 0.1715 - accuracy: 0.3499 - val_loss: 1.9033 - val_f1: 0.1604 - val_accuracy: 0.3333
Adjusting model parameters for better performance¶
- Get more data
- Simplify the model (if overfitting) or increase its capacity (if underfitting)
- Regularize the model
- Adjust the learning rate or the optimizer
- Use data augmentation
- Use transfer learning
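In a Keras pipeline, data augmentation would typically go through `ImageDataGenerator` (e.g. `horizontal_flip=True`, `rotation_range=...`), but the core idea is just applying random label-preserving transforms to each training image. A minimal NumPy sketch with a hypothetical toy image:

```python
import numpy as np

rng = np.random.default_rng(42)

def random_flip(image, p=0.5):
    """Flip an (H, W, C) image left-right with probability p — one simple augmentation."""
    if rng.random() < p:
        image = image[:, :, :][:, ::-1, :]  # reverse the width axis
    return image

# Deterministic demo on a tiny 2x3 single-channel "image":
img = np.arange(6).reshape(2, 3, 1)
flipped = img[:, ::-1, :]
print(flipped[:, :, 0])  # each row reversed: [[2 1 0], [5 4 3]]
```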
In [ ]:
Copied!
imgdir.view_random_prediction(tfmodels['TinyVGG-Extra-Conv-BatchNorm-Dense'], subset='train', datagen=test_datagen);
imgdir.view_random_prediction(tfmodels['TinyVGG-Extra-Conv-BatchNorm-Dense'], subset='train', datagen=test_datagen);
Save the models¶
In [ ]:
Copied!
for name, model in tfmodels.items():
tf.keras.models.save_model(model, f'../models/10_food_multiclass_classification/{name}')
for name, model in tfmodels.items():
tf.keras.models.save_model(model, f'../models/10_food_multiclass_classification/{name}')