Classifying Food with Computer Vision 🍞

An in-depth tutorial on identifying food with CNNs (using Keras) 📸

Ariel Liu
8 min read · Sep 25, 2021

Who doesn’t love good food?

A nice coffee in the morning starts your day on a pleasant note.

A hearty soup at lunch soothes the soul.

And a tasty dessert in the evening ties up your day.

Photo by Joyful on Unsplash

Yum!

Unfortunately, sometimes the food we want to eat comes with downsides. I can speak from personal experience: being unaware of what I'm putting into my body leads to quite the surprise on the scale the next morning.

And not really the good kind of surprise that we want.

This is why we created CalorMe: an app that lets users take photos of their food and have the AI quickly return all the nutritional information about that dish.

I'm not going to go into too much depth about the other aspects of CalorMe, but if you're interested you can read more here, or check out the code for the app here.

Instead, today I’m here to talk to you about how I created and trained the CNN to classify different foods, and how easy it is for you to do so as well!

Heads up: the rest of this article is written in more of a tutorial style. There might be some jargon or terms that are confusing to people completely new to AI, so here are a few resources:

Here’s a quick introduction to machine learning.

Here’s an article on convolutional neural networks (used for computer vision).

All the code is available on my GitHub repository here.

Table of Contents

1. Preparing the Data
1.1 Getting the data from Kaggle
1.2 Setting up training and testing folders
1.3 Using the ImageDataGenerator
2. Building our Model
2.1 Applying Transfer Learning Using InceptionV3
2.2 Building the Dense Head
3. Training our model
3.1 Compiling our model
3.2 Define Callbacks
3.3 Fit our model to the data
4. Analyzing and Testing Our Models
4.1 Analyzing our model's performance

Without further ado, let’s get into it!

1. Preparing the Data

1.1 Getting the data from Kaggle

I found a good food data set on Kaggle: https://www.kaggle.com/kmader/food41

It contains 101 different foods, ranging from baklava and deviled eggs to sashimi.

The first thing you need to do is select the images folder, then click the download button (circled in red on the top right). This will download the images as a zip that you'll have to extract.

Once everything is extracted, copy it over to the Jupyter notebook directory where you're planning to create the rest of your project. You could also upload it from within Jupyter using the upload button on the top right.

I prefer to use the file manager to copy the files into the working directory. Either way, you just need to end up with the images folder stored somewhere you can access it from Jupyter.

Then I downloaded the test.txt and train.txt files from the Kaggle data set and moved them into the same directory.

Downloading the test and train txt files
The test and train txt files copied into the working directory

Now that we have the files ready, we can start working with them in jupyter notebook.

1.2 Setting up training and testing folders

I created and ran a function that copies the images and folders we're going to use into their corresponding train and test folders. Since the Kaggle data set provides train and test txt files listing the path of each image, we can read those paths from the files and copy each image into the matching train or test folder.

Function for setting up train and test folders
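A minimal sketch of what this function can look like, assuming the Food-101 layout (each line of the txt files is a class/image_id pair and the images are .jpg files); the default food list is the one mentioned below:

import os
import shutil

def setup_training_data(txt_path, source_dir, dest_dir,
                        foods=('cheesecake', 'baklava', 'ramen')):
    """Copy the images listed in txt_path from source_dir into dest_dir,
    organized into one subfolder per food class."""
    with open(txt_path) as f:
        for line in f:
            rel_path = line.strip()                # e.g. "ramen/1234"
            food = rel_path.split('/')[0]
            if food not in foods:                  # skip classes we aren't using
                continue
            src = os.path.join(source_dir, rel_path + '.jpg')
            dst_folder = os.path.join(dest_dir, food)
            os.makedirs(dst_folder, exist_ok=True)  # create train/<food> as needed
            shutil.copy(src, dst_folder)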

Below, I’m running the function to create the train and test folders. (I just used the default parameters I coded into the function: cheesecake, baklava, and ramen.)

setup_training_data('meta/train.txt', 'images', 'train')
setup_training_data('meta/test.txt', 'images', 'test')
The inside of the created train folder after running the function

This example classifies three different foods. Of course, you can totally train the model to identify many more types of food! (Fair warning: this model takes about half an hour to train, and that will only increase with more data.)

Simply change the list of foods you save in setup_training_data, change the number of units in the last Dense layer to match the number of unique food types, and perhaps add more layers to the Sequential model. You can also check out my other examples that label more food types here.

An example of a piece of the training data

Complete code for 1.2 Setting up training and testing folders below:

Classifying food — setup_training_data function

Our next step is preprocessing the image using the Keras ImageDataGenerator!

1.3 Using the ImageDataGenerator

I used the ImageDataGenerator to preprocess and resize the images for training. You can also use it for image augmentation, which means altering existing image data for training: shifting an image to the right, rotating it, changing the brightness, and so on. This helps the model adapt to the data it sees in deployment and recognize the same food even if it's rotated or flipped!

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
data_gen = ImageDataGenerator(rotation_range=30, rescale=1./255, validation_split=0.2)

I start off by importing the needed modules and creating the data generator. I set the generator to augment the data with random rotations of up to 30 degrees, rescale pixel values into the 0–1 range, and reserve 20% of the data for validation.

image_size = (235, 235)
train_gen = data_gen.flow_from_directory('train', target_size=image_size)  # class_mode defaults to 'categorical'
test_gen = data_gen.flow_from_directory('test', target_size=image_size)  # batch_size defaults to 32

I apply this to the train and test folders, so now our data is ready to be fed into a model!

The code for this section is below, and the complete code is on my GitHub here.

Code for 1.3 ImageDataGenerator

2. Building our Model

2.1 Applying Transfer Learning Using InceptionV3

Most convolutional neural networks for computer vision build on the base of a pre-trained model, meaning a base whose weights have already been trained on a large image data set.

We take a model that has already been trained on image data and attach this base to an untrained head. The base retains the weights that allow it to extract features, so we don't need to train it from scratch; we just attach fresh Dense layers and train them to classify those features.

In summary:

  • Base → extract features with convolutional base
  • Head → identify class with dense head

Keras offers many pre-trained models, and you can find a full list here. I tested out different base models like InceptionV3, VGG16, and MobileNetV2. I found InceptionV3 to perform the best, so it’s the one I’m using here.

First things first, import the InceptionV3 model from Keras. Then set the weights to the pre-trained ImageNet weights and set the input shape to the width and height of the image. Finally, make sure to freeze the weights, since we don't want to retrain our base.
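A rough sketch of those three steps (the variable name base_model is my own choice):

from tensorflow.keras.applications import InceptionV3

# Load InceptionV3 with pre-trained ImageNet weights, without its
# original classification head, sized to match our (235, 235) images.
base_model = InceptionV3(weights='imagenet', include_top=False,
                         input_shape=(235, 235, 3))

# Freeze the base so its feature-extraction weights aren't retrained.
base_model.trainable = False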

2.2 Building the Dense Head

Import the needed modules and create a new Sequential model. We start with the InceptionV3 base that we created and then build our head on top of it.

I added a GlobalAveragePooling layer to greatly reduce the amount of information going into the Dense layers. I also added a 30% dropout layer; dropout helps prevent over-fitting by randomly "dropping out" neurons, making sure the model isn't over-relying on specific ones. Make sure to end with a Dense layer that has the same number of units as the number of classes.
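A sketch of that model: the 30% dropout and the three output units for our three food classes come from the description above, while the softmax activation is my assumption (it's the usual pairing with categorical crossentropy):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dropout, Dense

model = Sequential([
    base_model,                      # frozen InceptionV3 convolutional base
    GlobalAveragePooling2D(),        # collapse feature maps into one vector
    Dropout(0.3),                    # 30% dropout to curb over-fitting
    Dense(3, activation='softmax'),  # one unit per food class
])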

model.summary()

3. Training our model

3.1 Compiling our model

In this step we need to define the optimizer, the loss function, and the metrics. For our optimizer we'll be using Adam, a stochastic gradient descent algorithm; it's a good general-purpose optimizer.

The loss function measures the disparity between the model's predictions and the actual labels. Since we're doing multi-class classification, it should be categorical crossentropy. (A regression problem, by contrast, might use something like mean absolute error.)
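Putting that together, the compile call can look like this:

model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # multi-class classification
              metrics=['accuracy'])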

3.2 Define Callbacks

Callbacks are functions that run during the training of the model. In our case, early stopping will stop training if the model doesn't show a minimum amount of improvement (min_delta=0.001) within the window we set (5 epochs).

The checkpoint function allows us to periodically save checkpoints of our model during training.
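A sketch of both callbacks using the values mentioned above; restore_best_weights and the checkpoint filename are my own choices:

from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

early_stopping = EarlyStopping(min_delta=0.001,  # minimum improvement that counts
                               patience=5,       # stop after 5 epochs without it
                               restore_best_weights=True)
checkpoint = ModelCheckpoint('model_checkpoint.h5', save_best_only=True)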

3.3 Fit our model to the data

Finally we get to the part where we actually train our model. We fit the model to the training data, validate on the test data, and make sure to add the callbacks we defined earlier.

Note: since we have early stopping, it's better to run for more epochs and let training stop on its own once the accuracy stops improving.
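The fit call can look like this (the epoch count here is arbitrary, since early stopping will halt training on its own):

history = model.fit(train_gen,
                    validation_data=test_gen,
                    epochs=30,
                    callbacks=[early_stopping, checkpoint])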

4. Analyzing and Testing Our Models

4.1 Analyzing our model’s performance

Convert "history" into a pandas DataFrame so that we can use the built-in plot function to quickly chart the change in accuracy and loss over time.
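For example (assuming the metric names Keras records when compiling with metrics=['accuracy']):

import pandas as pd

history_df = pd.DataFrame(history.history)
history_df[['accuracy', 'val_accuracy']].plot(title='Accuracy')
history_df[['loss', 'val_loss']].plot(title='Loss')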

Accuracy and loss during training

🎉 Overall, the model has a pretty good learning curve. We don’t see too much overfitting or underfitting.

We were able to achieve an accuracy of 96% on the testing data which is pretty solid! Good work! 🙌

Check out the full code for this article here: https://github.com/arielycliu/Articles/tree/master/Classifying%20Food

Check out the AI portion of CalorMe: https://github.com/arielycliu/CalorMeAI

Thanks for reading my article! I hope you learned something or got something new to think about!

If you want to read more articles in the future, give my account a follow!

In the meantime, feel free to contact me at ariel.yc.liu@gmail.com or connect with me on LinkedIn.

Or check out my Github here!

Till next time! 👋


Ariel Liu

A machine learning enthusiast who’s always learning~