This project builds a convolutional neural network (CNN) to detect a hand in video. I wanted a simple example to get more hands-on experience with CNNs, since I never used them specifically in graduate school. I decided to work from scratch while keeping the scope small: start with nothing, create my own image data set, and use it to train a model.

What did I do in this project?

I decided to build a model that detects my hand using my webcam. The idea was to draw a bounding box around my hand in videos captured with the webcam. After some initial research, I found that a common way to get started with a CNN is transfer learning. Rather than training a network from scratch, a pre-trained model is used for the first portion of the network, and new trainable layers are appended to the end of it. This shortens training because the pre-trained layers have already learned general-purpose image features, so most of the work is fitting the new layers. For this project, I used the pre-trained VGG19 model that is included with Keras.
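To make the transfer-learning idea concrete, here is a minimal sketch of how a frozen VGG19 base with a small trainable head might be wired up in Keras. It is an illustration under assumptions, not the notebook's actual code: the function name `build_hand_detector`, the 224x224 input size, the 128-unit hidden layer, the four-value normalized box output, and the mean-squared-error loss are all hypothetical choices.

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG19


def build_hand_detector(input_shape=(224, 224, 3), weights="imagenet"):
    """Hypothetical helper: frozen VGG19 base plus a small trainable head."""
    # Convolutional part of VGG19 only; include_top=False drops the
    # original ImageNet classifier layers.
    base = VGG19(include_top=False, weights=weights, input_shape=input_shape)
    base.trainable = False  # keep the pre-trained filters fixed

    inputs = keras.Input(shape=input_shape)
    # Inputs are assumed to be preprocessed already, e.g. with
    # keras.applications.vgg19.preprocess_input.
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation="relu")(x)
    # Assumed output encoding: four box coordinates scaled to [0, 1].
    outputs = layers.Dense(4, activation="sigmoid")(x)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse")
    return model
```

Calling `build_hand_detector()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is handy for checking shapes. Only the two `Dense` layers are updated during training, which is where the time savings come from.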

Results

Overall, the project was a success. Sample videos are shown below; none of them were part of the training, validation, or test sets.

Final Thoughts

This project was done as a learning experience. Since the model was trained only on images from my webcam, all of them of myself, it doesn’t perform well on videos of other people. To generalize, it would need much more training data covering different cameras, subjects, and environments. Given that this was just a project to practice CNNs and transfer learning, I don’t intend to pursue it further.

The Notebook

The notebook is embedded below, but it’s best viewed at nbviewer.org. The full project can be found on my GitHub.