This is a project I made to classify images using a convolutional neural network (CNN) trained on the CIFAR-10 dataset. It can recognize 10 types of objects: airplane, car, bird, cat, deer, dog, frog, horse, ship, and truck.
It has a web interface where you upload an image and it shows you what the model thinks it is, along with the confidence percentage for each class.
Install the required libraries first:
pip install tensorflow opencv-python matplotlib scikit-learn seaborn flask flask-cors
Then run the web interface:
python main.py --server
It will open the browser automatically at http://localhost:5000. Upload any image and the model will classify it.
Other ways to run it:
# Classify a single image and see the result in the terminal
python main.py --image testpic.jpg
# Train a new model from scratch (takes a while)
python main.py --train
Files in the project:
- main.py: the main code (model, training, prediction, server)
- index.html: the web page
- image_classifier.keras: the trained model
- testpic.jpg / testpic1.jpg / testpic2.jpg: test images
The model is a CNN with 3 blocks. Each block has two convolutional layers followed by pooling and dropout. Then there is a dense layer at the end that outputs probabilities for the 10 classes.
I added a few things to make it better:
- BatchNormalization — makes training more stable
- Dropout — randomly turns off some neurons during training so the model doesn't just memorize the data
- Data Augmentation — randomly flips and rotates training images so the model sees more variety
- Early Stopping — automatically stops training if the model stops improving
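The architecture and the additions listed above can be sketched roughly like this in Keras. This is a minimal sketch, not the exact code in main.py: the filter counts, dropout rates, rotation factor, optimizer, loss, and early-stopping patience are all assumed values, and `conv_block` and `build_model` are names made up for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

NUM_CLASSES = 10  # CIFAR-10 has 10 classes


def conv_block(filters, dropout_rate):
    """Two conv layers (each followed by batch norm), then pooling and dropout."""
    return [
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),
        layers.Dropout(dropout_rate),
    ]


def build_model():
    stack = [
        tf.keras.Input(shape=(32, 32, 3)),
        # Augmentation layers are only active during training.
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.05),  # 0.05 of a full turn, about +/- 18 degrees
    ]
    # Three blocks; filter counts and dropout rates are assumptions.
    for filters, rate in [(32, 0.2), (64, 0.3), (128, 0.4)]:
        stack += conv_block(filters, rate)
    stack += [
        layers.Flatten(),
        # Softmax turns the 10 outputs into class probabilities.
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ]
    model = models.Sequential(stack)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Stop once validation loss has not improved for 5 epochs (assumed patience)
# and roll back to the best weights seen so far.
early_stop = callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

model = build_model()
```

To train with it, `early_stop` would be passed as `callbacks=[early_stop]` to `model.fit(...)` together with some validation data.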
The model gets around 63% accuracy on the test set. CIFAR-10 images are only 32x32 pixels so some classes like cat and dog are hard to tell apart even for humans.
The model sometimes gives wrong predictions when you upload a real photo. For example it might say dog on an image that is clearly something else.
This happens for two reasons. First, CIFAR-10 images are only 32x32 pixels. When you upload a real photo it gets resized down to that size and a lot of detail gets lost in the process. Second, the model was only trained on the CIFAR-10 dataset, which has a very specific style of small, low-resolution images. It has never seen anything outside that dataset, so it struggles with photos that look different.
This is a known limitation of the project and not something I plan to fix for now. The point was to learn how the full pipeline works from training to deployment, not to build a perfect classifier. A proper fix would involve using a pretrained model like MobileNetV2 or ResNet that was already trained on millions of real images, which is a more advanced topic.
This model can only output one class per image. If you upload a photo that has two objects in it, for example a dog and a cat together, it will still pick only one answer — whichever object it sees more strongly in the image.
You will notice a few things in this case. The confidence score will be lower than usual, the uncertainty score will be higher, and the bar chart will show two classes with close percentages instead of one clearly dominant bar. For example it might show dog at 48% and cat at 32%, which means the model is unsure and is split between the two.
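A simple way to spot this "split" situation in code is to look at the gap between the top two probabilities. The sketch below uses the dog 48% / cat 32% example from above. The `top_two` helper and the 0.20 threshold are hypothetical and only there to illustrate the idea.

```python
# Class order follows the usual CIFAR-10 label indices.
CLASS_NAMES = ["airplane", "car", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]


def top_two(probs, names=CLASS_NAMES):
    """Return the two most likely classes and the gap between them."""
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    (first, p1), (second, p2) = ranked[0], ranked[1]
    return first, second, p1 - p2


# dog 48%, cat 32%, and the remaining 20% spread over the other 8 classes.
probs = [0.025, 0.025, 0.025, 0.32, 0.025, 0.48, 0.025, 0.025, 0.025, 0.025]
first, second, margin = top_two(probs)

if margin < 0.20:  # hypothetical threshold for "the model is split"
    print(f"Unsure: {first} vs {second} (gap {margin:.2f})")
else:
    print(f"Prediction: {first}")
```

A small gap does not prove there are two objects in the image, it only says the model could not decide between two classes.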
This is a limitation of single-label image classification. The model was trained on CIFAR-10, where every image has exactly one label, so it only learned to answer "what is the main object in this image" and not "what are all the objects." It cannot detect or label multiple things at the same time.
If you want to properly handle images with multiple objects you would need a different type of model called an object detection model. Models like YOLO or Faster R-CNN can draw a box around each object they find and label each one separately. That is a more advanced topic and a good next step after understanding classification.
After you upload an image, the web page shows:
- The predicted class and confidence percentage
- A bar chart with the probability for each of the 10 classes
- An uncertainty score (entropy) — if the model is not sure, this will be high
- A history of your last few predictions
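The uncertainty score in the list above is based on entropy. Here is a minimal sketch of how Shannon entropy can be computed from the 10 class probabilities. Dividing by log(10) so the score lands between 0 and 1 is an assumption for the example; main.py may report the raw value or scale it differently.

```python
import math


def entropy(probs, normalize=True):
    """Shannon entropy of a probability vector.

    With normalize=True the result is divided by log(number of classes),
    so 0 means fully certain and 1 means a uniform guess.
    """
    # Skip zero probabilities: p * log(p) tends to 0 as p goes to 0.
    h = sum(-p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs)) if normalize else h


confident = [0.91] + [0.01] * 9   # one clearly dominant class
split = [0.025, 0.025, 0.025, 0.32, 0.025, 0.48, 0.025, 0.025, 0.025, 0.025]
uniform = [0.1] * 10              # the model has no idea

for name, probs in [("confident", confident), ("split", split), ("uniform", uniform)]:
    print(f"{name}: {entropy(probs):.2f}")
```

The score rises from the confident case to the split case and is highest when all classes are equally likely.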
CIFAR-10 is a well-known dataset with 60,000 images split into 10 classes. 50,000 are used for training and 10,000 for testing. It gets downloaded automatically the first time you run the code.