Upload 2 files
- CSTRACK_S2_ep0050.pth.tar (+3 −0)
- README.md (+228 −3)

CSTRACK_S2_ep0050.pth.tar (added; Git LFS pointer):
version https://git-lfs.github.com/spec/v1
oid sha256:6ca75f2796999ae62cd31a65ee968826eb2587f60040b6a92a64d2f24b124b5e
size 383756049

README.md (changed):
# CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features

> [Xiaokun Feng](https://scholar.google.com.hk/citations?user=NqXtIPIAAAAJ), [Dailing Zhang](https://scholar.google.com.hk/citations?user=ApH4wOcAAAAJ), [Shiyu Hu](https://huuuuusy.github.io/), [Xuchen Li](https://github.com/Xuchen-Li), [Meiqi Wu](https://scholar.google.com.hk/citations?user=fGc7NVAAAAAJ), [Jing Zhang](https://github.com/XiaokunFeng/CSTrack), [Xiaotang Chen](http://www.ia.cas.cn/rcdw/fyjy/202404/t20240422_7129814.html), [Kaiqi Huang](https://people.ucas.ac.cn/~0004554)

[](https://huggingface.co/Xiaokunfeng2022/CSTrack)

This is the official PyTorch implementation of the paper **CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features**.

## 🔥 Updates

* \[5/2025\] **CSTrack**'s code is available!
* \[5/2025\] **CSTrack** is accepted by ICML 2025!

## 📣 Overview
### Our motivation & Core modeling approach



Effectively modeling and utilizing spatiotemporal features from RGB and other modalities (e.g., depth, thermal, and event data, denoted as X) is the core of RGB-X tracker design.
Existing methods often employ two parallel branches to process the RGB and X input streams separately, requiring the model to handle two dispersed feature spaces simultaneously, which complicates both the model structure and the computation process.
More critically, intra-modality spatial modeling within each dispersed space incurs substantial computational overhead, limiting the resources available for inter-modality spatial modeling and temporal modeling.
To address this, we propose a novel tracker, **CSTrack**, which focuses on modeling **C**ompact **S**patiotemporal features to achieve simple yet effective tracking.
Specifically, we first introduce an innovative **Spatial Compact Module** that integrates the RGB-X dual input streams into a compact spatial feature, enabling thorough intra- and inter-modality spatial modeling.
Additionally, we design an efficient **Temporal Compact Module** that compactly represents temporal features by constructing a refined target distribution heatmap.
Extensive experiments validate the effectiveness of our compact spatiotemporal modeling method, with CSTrack achieving new SOTA results on mainstream RGB-X benchmarks.
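
The two ideas can be illustrated with a minimal, framework-free sketch (our own simplification for intuition, not the actual CSTrack code): the spatial step merges RGB and X patch tokens into one compact token sequence so a single backbone can attend within and across modalities at once, and the temporal step summarizes the target state as a normalized distribution heatmap.

```python
import numpy as np

def compact_spatial_tokens(rgb_tokens, x_tokens):
    """Merge RGB and X patch tokens into one compact token sequence.

    rgb_tokens, x_tokens: (N, C) arrays of patch embeddings from each modality.
    A single joint sequence lets one backbone perform intra- and inter-modality
    attention together, instead of running two parallel branches.
    """
    return np.concatenate([rgb_tokens, x_tokens], axis=0)  # (2N, C)

def target_distribution_heatmap(h, w, cx, cy, sigma=2.0):
    """Gaussian heatmap centered on the predicted target center (cx, cy),
    used here as a stand-in for the refined target distribution heatmap."""
    ys, xs = np.mgrid[0:h, 0:w]
    heat = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return heat / heat.sum()  # normalize to a probability distribution

# Example: 16x16 patch grids from both modalities, 256-dim embeddings.
tokens = compact_spatial_tokens(np.zeros((256, 256)), np.zeros((256, 256)))
# tokens.shape is (512, 256): one compact sequence covering both modalities.
```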
### Strong performance

| Tracker | LasHeR (AUC) | RGBT234 (MSR) | VOT-RGBD22 (EAO) | DepthTrack (F-score) | VisEvent (AUC) |
|------------|--------------|---------------|------------------|----------------------|----------------|
| **CSTrack** | **60.8** | **70.9** | **77.4** | **65.8** | **65.2** |
| SDSTrack | 53.1 | 62.5 | 72.8 | 61.9 | 59.7 |
| OneTracker | 53.8 | 62.5 | 72.7 | 60.9 | 60.8 |
| ViPT | 52.5 | 61.7 | 72.1 | 59.6 | 59.2 |
| UnTrack | 51.3 | 62.5 | 72.1 | 61.0 | 58.9 |

## 🔨 Installation
```
conda create -n cstrack python=3.8
conda activate cstrack
bash install.sh
```

## 🔧 Usage

### Data Preparation
Our CSTrack is jointly trained on RGB and RGB-D/T/E datasets.
Put these tracking datasets in [./data](data). It should look like:

For RGB datasets:
```
${CSTrack_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
     -- VastTrack
         |-- unisot_train_final_backup
             |-- Aardwolf
             ...
             |-- Zither
         |-- unisot_final_test
             |-- Aardwolf
             ...
             |-- Zither
     -- tnl2k
         -- train
             |-- Arrow_Video_ZZ04_done
             |-- Assassin_video_1-Done
             |-- Assassin_video_2-Done
             ...
         -- test
             |-- advSamp_Baseball_game_002-Done
             |-- advSamp_Baseball_video_01-Done
             |-- advSamp_Baseball_video_02-Done
             ...
```
For RGB-D/T/E datasets:
```
${CSTrack_ROOT}
 -- data
     -- depthtrack
         -- train
             |-- adapter02_indoor
             |-- bag03_indoor
             |-- bag04_indoor
             ...
     -- lasher
         -- trainingset
             |-- 1boygo
             |-- 1handsth
             |-- 1phoneblue
             ...
         -- testingset
             |-- 1blackteacher
             |-- 1boycoming
             |-- 1stcol4thboy
             ...
     -- visevent
         -- train
             |-- 00142_tank_outdoor2
             |-- 00143_tank_outdoor2
             |-- 00144_tank_outdoor2
             ...
         -- test
             |-- 00141_tank_outdoor2
             |-- 00147_tank_outdoor2
             |-- 00197_driving_outdoor3
             ...
         -- annos
```
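
As a quick sanity check before training, a short script along these lines (a hypothetical helper, not part of the repo) can verify that the expected top-level dataset folders exist under `./data`:

```python
import os

# Expected top-level dataset directories, taken from the layouts above.
RGB_DATASETS = ["lasot", "got10k", "coco", "trackingnet", "VastTrack", "tnl2k"]
RGBX_DATASETS = ["depthtrack", "lasher", "visevent"]

def check_data_layout(root):
    """Return the list of expected dataset folders missing under <root>/data."""
    data_dir = os.path.join(root, "data")
    return [name for name in RGB_DATASETS + RGBX_DATASETS
            if not os.path.isdir(os.path.join(data_dir, name))]

if __name__ == "__main__":
    missing = check_data_layout(".")
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders found.")
```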

### Set project paths
Run the following command to set the paths for this project:
```
python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .
```
After running this command, you can also modify the paths by editing these two files:
```
lib/train/admin/local.py        # paths for training
lib/test/evaluation/local.py    # paths for testing
```
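
For orientation, `lib/train/admin/local.py` in trackers of this family typically defines a settings class with one path attribute per dataset, roughly like the sketch below (the attribute names here are illustrative assumptions; check the generated file for the exact fields):

```python
class EnvironmentSettings:
    def __init__(self):
        # Illustrative paths only; the generated local.py defines the real ones.
        self.workspace_dir = '.'                      # where checkpoints and logs are saved
        self.lasot_dir = './data/lasot'               # per-dataset roots follow the layout above
        self.got10k_dir = './data/got10k'
        self.depthtrack_dir = './data/depthtrack'
        self.lasher_dir = './data/lasher'
        self.visevent_dir = './data/visevent'
```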

### Train
#### Prepare pretrained backbone
The backbone and patch embedding of CSTrack are initialized with pre-trained weights from [**Fast-iTPN**](https://github.com/sunsmarterjie/iTPN).
Please download the **fast_itpn_base_clipl_e1600.pt** checkpoint and place it in [./resource/pretrained_models](./resource/pretrained_models).
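
Initialization of this kind usually amounts to copying over only the checkpoint entries whose names and shapes match the target model. A minimal dictionary-level sketch (our own illustration, independent of the actual CSTrack loading code; in practice the two dicts come from `torch.load(...)` and `model.state_dict()`, numpy arrays stand in for tensors here):

```python
import numpy as np

def filter_pretrained(checkpoint, model_state):
    """Keep checkpoint entries whose key exists in the model state with a
    matching shape; report the rest so mismatches are visible."""
    kept = {k: v for k, v in checkpoint.items()
            if k in model_state and v.shape == model_state[k].shape}
    skipped = sorted(set(checkpoint) - set(kept))
    return kept, skipped
```

The model would then be loaded with the `kept` entries (e.g. via `load_state_dict(..., strict=False)` in PyTorch), leaving unmatched parameters such as task-specific heads at their fresh initialization.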

#### Stage1: Train CSTrack without Temporal Compact Module
```
python -m torch.distributed.launch --nproc_per_node 4 lib/train/run_training.py --script cstrack_s1 --config cstrack_s1 --save_dir .
```

#### Stage2: Train CSTrack with Temporal Compact Module
First, set the **PRETRAINED_PATH** in [./experiments/cstrack_s2/cstrack_s2.yaml](./experiments/cstrack_s2/cstrack_s2.yaml) to the path of the model weights obtained from Stage 1 training.
Then, run the following command:

```
python -m torch.distributed.launch --nproc_per_node 4 lib/train/run_training.py --script cstrack_s2 --config cstrack_s2 --save_dir .
```

### Test and evaluate on benchmarks
First, set **settings.checkpoints_path** in [./lib/test/evaluation/local.py](./lib/test/evaluation/local.py) to the directory where the model checkpoint to be evaluated is stored.
Then, run the following commands to evaluate on the different benchmarks.

- LasHeR
```
# For stage1
python ./RGBT_workspace/test_rgbt_mgpus.py --script_name cstrack_s1 --dataset_name LasHeR --yaml_name cstrack_s1

# For stage2
python ./RGBT_workspace/test_rgbt_mgpus.py --script_name cstrack_s2 --dataset_name LasHeR --yaml_name cstrack_s2

# These commands produce the raw tracking results; please use the official MATLAB toolkit to evaluate them.
```
- RGBT-234
```
# For stage1
python ./RGBT_workspace/test_rgbt_mgpus.py --script_name cstrack_s1 --dataset_name RGBT234 --yaml_name cstrack_s1

# For stage2
python ./RGBT_workspace/test_rgbt_mgpus.py --script_name cstrack_s2 --dataset_name RGBT234 --yaml_name cstrack_s2

# These commands produce the raw tracking results; please use the official MATLAB toolkit to evaluate them.
```
- VisEvent
```
# For stage1
python ./RGBE_workspace/test_rgbe_mgpus.py --script_name cstrack_s1 --dataset_name VisEvent --yaml_name cstrack_s1

# For stage2
python ./RGBE_workspace/test_rgbe_mgpus.py --script_name cstrack_s2 --dataset_name VisEvent --yaml_name cstrack_s2

# These commands produce the raw tracking results; please use the official MATLAB toolkit to evaluate them.
```
- DepthTrack
```
cd Depthtrack_workspace
# For stage1
vot evaluate cstrack_s1
vot analysis cstrack_s1 --nocache

# For stage2
vot evaluate cstrack_s2
vot analysis cstrack_s2 --nocache
```
- VOT-RGBD22
```
cd VOT22RGBD_workspace
# For stage1
vot evaluate cstrack_s1
vot analysis cstrack_s1 --nocache

# For stage2
vot evaluate cstrack_s2
vot analysis cstrack_s2 --nocache
```

## 📊 Model Zoo
The trained models and the raw tracking results are provided on [Hugging Face](https://huggingface.co/Xiaokunfeng2022/CSTrack).
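
For example, the Stage 2 checkpoint uploaded in this commit can be fetched programmatically with `huggingface_hub` (a sketch assuming the file sits at the repo root, as in this upload):

```python
from huggingface_hub import hf_hub_download

def fetch_stage2_checkpoint():
    """Download the Stage 2 checkpoint and return its local cache path."""
    return hf_hub_download(
        repo_id="Xiaokunfeng2022/CSTrack",
        filename="CSTRACK_S2_ep0050.pth.tar",
    )
```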

## ❤️ Acknowledgement
We would like to express our gratitude to the following open-source repositories that our work is based on: [SeqtrackV2](https://github.com/chenxin-dlut/SeqTrackv2), [SUTrack](https://github.com/chenxin-dlut/SUTrack), [ViPT](https://github.com/jiawen-zhu/ViPT), [AQATrack](https://github.com/GXNU-ZhongLab/AQATrack), [iTPN](https://github.com/sunsmarterjie/iTPN).
Their contributions have been invaluable to this project.