wild-chimpanzee-foundation/uniformerv2_large-clip-k710-pre-k400_cb-focal-loss


	---
	license: mit
	language:
	- en
	pipeline_tag: video-classification
	---

	# Model Card for UniformerV2

	UniformerV2 is a large transformer-based model trained on a binary classification task: detecting whether the input video contains one or more chimpanzees reacting to the presence of a camera trap.

	## Model Details

	### Model Description

	The model is a binary video classifier. Given a camera trap video, it predicts whether one or more chimpanzees in the video react to the presence of the camera trap. Because the dataset is dominated by videos that show no reaction to the camera, we train with a class-balanced focal loss to address the class imbalance.

	- Developed by: Otto Brookes, Christophe Boesch, Hjalmar S. Kühl, Majid Mirmehdi, Tilo Burghardt
	- Model type: Vision Transformer, UniformerV2
	- License: MIT
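
The class-balanced focal loss mentioned above can be sketched roughly as follows. This is a minimal illustration, not the training code used for this model. It assumes the common formulation in which each class is weighted by the inverse of its "effective number" of samples, `(1 - beta) / (1 - beta**n)`, and the weighted term is a focal loss on a single predicted probability. The function names, the class counts, and the `beta` and `gamma` values are all hypothetical.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights (1 - beta) / (1 - beta**n), rescaled to sum to the number of classes."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def cb_focal_loss(prob_reaction, label, weights, gamma=2.0, eps=1e-12):
    """Class-balanced focal loss for one video.

    prob_reaction: predicted probability of the "reaction" class (label 1).
    label: 0 for "no reaction", 1 for "reaction".
    """
    # Probability the model assigns to the true class.
    p_t = prob_reaction if label == 1 else 1.0 - prob_reaction
    p_t = min(max(p_t, eps), 1.0)  # guard against log(0)
    # (1 - p_t)**gamma down-weights examples the model already gets right.
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical class counts: many "no reaction" videos, few "reaction" videos.
weights = class_balanced_weights([9000, 1000])
print(weights)
print(cb_focal_loss(0.3, 1, weights))  # hard example from the rare class
print(cb_focal_loss(0.3, 0, weights))  # easier example from the common class
```

With these assumed counts the rare "reaction" class receives the larger weight, so a mistake on a reaction video contributes more to the loss than an equally confident mistake on a no-reaction video.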

	## Training Details

	### Training Data
	The model is trained on camera trap video footage from 15 different countries in Africa as part of the Pan African Programme: The Cultured Chimpanzee.

	### Results
	We use mean average precision (mAP) to evaluate models.

	| Dataset | Model | Loss | mAP (%) |
	|-----------|------------|------------|---------|
	| PanAf | UniformerV2 | CB Focal | 87.82 |
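
For readers unfamiliar with the metric, mean average precision can be computed along the following lines. This is a sketch only: the card does not state which variant of average precision was used or how classes were averaged, so the code assumes non-interpolated average precision per class, averaged over both classes. The function names and the toy scores and labels are hypothetical.

```python
def average_precision(scores, labels):
    """Non-interpolated AP for one class: mean of precision@k over ranks k that hold a positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(scores_per_class, labels_per_class):
    """Unweighted mean of the per-class AP values."""
    aps = [average_precision(s, l) for s, l in zip(scores_per_class, labels_per_class)]
    return sum(aps) / len(aps)


# Toy example: four videos, predicted probability of the "reaction" class.
reaction_scores = [0.9, 0.8, 0.3, 0.1]
reaction_labels = [1, 0, 1, 0]

# Scores and labels for the complementary "no reaction" class.
no_reaction_scores = [1.0 - s for s in reaction_scores]
no_reaction_labels = [1 - y for y in reaction_labels]

map_value = mean_average_precision(
    [reaction_scores, no_reaction_scores],
    [reaction_labels, no_reaction_labels],
)
print(f"mAP: {100 * map_value:.2f}%")
```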