---
license: mit
language:
- en
pipeline_tag: video-classification
---
# Model Card for UniformerV2
UniformerV2 is a large transformer-based model trained on a binary classification task: detecting whether an input video contains one or more chimpanzees reacting to the presence of a camera trap.
## Model Details
### Model Description
The model takes a video as input and predicts whether it contains one or more chimpanzees reacting to the presence of a camera trap. Because the dataset heavily favours videos showing no reaction to the camera, we employ a class-balanced focal loss to address the class imbalance.
- **Developed by:** Otto Brookes, Christophe Boesch, Hjalmar S. Kühl, Majid Mirmehdi, Tilo Burghardt
- **Model type:** Vision Transformer, UniformerV2
- **License:** MIT
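As an illustration of the class-balanced focal loss mentioned above, the following is a minimal sketch of one common formulation (per-class weights from the "effective number of samples", combined with a focal term). It is not the training code for this model: the function names, the `beta` and `gamma` values, and the class counts in the usage example are assumptions chosen for illustration.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights based on the effective number of samples.

    beta is an assumed hyperparameter; the model card does not state it.
    """
    effective_num = [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]
    raw = [1.0 / e for e in effective_num]
    # Normalise so that the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def cb_focal_loss(probs, labels, samples_per_class, beta=0.999, gamma=2.0):
    """Mean class-balanced focal loss for binary classification.

    probs:  predicted probability of the positive class (reaction), one per video
    labels: 0 (no reaction) or 1 (reaction), one per video
    gamma:  assumed focusing parameter; gamma=0 removes the focal term
    """
    weights = class_balanced_weights(samples_per_class, beta)
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
        total += -weights[y] * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(probs)


# Hypothetical class counts: 9000 "no reaction" videos, 1000 "reaction" videos.
weights = class_balanced_weights([9000, 1000])
loss = cb_focal_loss([0.9, 0.2], [1, 1], [9000, 1000])
```

In this sketch the rarer class receives the larger weight, and the focal term down-weights examples the model already classifies confidently.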
## Training Details
### Training Data
The model is trained on camera trap video footage from 15 countries in Africa, collected as part of the Pan African Programme: The Cultured Chimpanzee.
### Results
We use mean average precision (mAP) to evaluate models.
| Dataset | Model     | Loss     | mAP (%) |
|---------|-----------|----------|---------|
| PanAf   | Uniformer | CB Focal | 87.82   |
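For readers unfamiliar with the metric, the sketch below shows one common way to compute mean average precision: average precision (AP) is computed per class from ranked scores, and the per-class values are averaged. The exact evaluation protocol behind the figure above is not stated in this card, so the function names and the toy scores here are assumptions for illustration only.

```python
def average_precision(scores, labels):
    """AP as the mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(scores_per_class, labels_per_class):
    """Mean of the per-class AP values."""
    aps = [average_precision(s, l) for s, l in zip(scores_per_class, labels_per_class)]
    return sum(aps) / len(aps)


# Toy example (made-up scores): probability of "reaction" for four videos.
reaction_scores = [0.9, 0.8, 0.7, 0.6]
reaction_labels = [1, 0, 1, 0]

# For the "no reaction" class, scores and labels are the complements.
no_reaction_scores = [1.0 - s for s in reaction_scores]
no_reaction_labels = [1 - y for y in reaction_labels]

map_value = mean_average_precision(
    [reaction_scores, no_reaction_scores],
    [reaction_labels, no_reaction_labels],
)
```

A perfect ranking, where every positive video is scored above every negative one, gives an AP of 1.0 for that class.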