Image Scenery Classification
This model is built on the efficientnet_b2 architecture.
It starts from the pretrained weights available in the torchvision.models library.
The classification head was replaced with a dropout layer followed by a linear layer with 6 target classes.
Using transfer learning, the model was then trained on the Intel image dataset.
See the corresponding Hugging Face Space for a live demo of the model.
Performance
The model achieved a test accuracy of 89.67%.
Misclassified images are often ambiguous, such as a snowy mountain being misclassified as 'glacier'.
The model architecture is quite simple compared to SOTA architectures and produces fast predictions.
A prediction on the Hugging Face Space, hosted on the free CPU hardware, takes about 0.2 seconds.
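A per-prediction time like the one above can be estimated with a small timing loop. The sketch below is illustrative only and is not taken from the model's code: measure_latency and dummy_predict are hypothetical names, and dummy_predict stands in for a real call to the model.

```python
import statistics
import time


def measure_latency(predict_fn, example, n_warmup=3, n_runs=20):
    """Return the median wall-clock time in seconds of predict_fn(example)."""
    # Warm-up calls are not timed, so one-off setup costs do not skew the result.
    for _ in range(n_warmup):
        predict_fn(example)

    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        predict_fn(example)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def dummy_predict(x):
    # Stand-in for a real model call; replace with the actual prediction function.
    return sum(x)


if __name__ == "__main__":
    latency = measure_latency(dummy_predict, list(range(1000)))
    print(f"median latency: {latency:.6f} s")
```

The median is used rather than the mean so that a single slow run, for example caused by other load on a shared CPU, has little effect on the reported number.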
The code is original and written by me.