nv-bschifferer committed on
Commit
780d274
·
1 Parent(s): a55c3ec

adding license

README.md CHANGED
@@ -27,7 +27,7 @@ The **nvidia/llama-nemoretriever-colembed-1b-v1** is a late interaction embeddin
27
  This model is for non-commercial/research use only.
28
 
29
  ### License/Terms of Use
30
- Governing Terms: [NVIDIA License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/LICENSE)
31
  Additional Information: [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/) for [siglip2-giant-opt-patch16-384](https://huggingface.co/google/siglip2-giant-opt-patch16-384); and [LLAMA 3.2 Community License Agreement](https://huggingface.co/meta-llama/Llama-3.2-1B/blob/main/LICENSE.txt) for [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B). Built with Meta Llama 3. Improved using Qwen.
32
 
33
  This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
@@ -37,6 +37,7 @@ This project will download and install additional third-party open source softwa
37
  - Gabriel Moreira
38
  - Radek Osmulski
39
  - Ronay Ak
 
40
  - Even Oldridge
41
  - Benedikt Schifferer
42
 
@@ -119,7 +120,7 @@ model = AutoModel.from_pretrained(
119
  trust_remote_code=True,
120
  torch_dtype=torch.bfloat16,
121
  attn_implementation="flash_attention_2",
122
- revision='6a21313a150a903bc522dc0d15ed47784a0d4c8d'
123
  ).eval()
124
 
125
  # Queries
@@ -165,7 +166,7 @@ The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia
165
  pip install git+https://github.com/illuin-tech/vidore-benchmark@e0eb9032e7e00adc8aa6f9cb35d5a9371f67485a
166
  # Downgrade transformers as vidore will install latest transformers
167
  pip install transformers==4.49.0
168
- CUDA_VISIBLE_DEVICES=0; python3 vidore_eval.py --model_name_or_path nvidia/llama-nemoretriever-colembed-1b-v1 --savedir_datasets ./results/
169
  ```
170
 
171
  The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/mteb_eval.py) to evaluate MTEB VisualDocumentRetrieval. We install ViDoRe benchmark to capture dependencies, first.
@@ -185,7 +186,7 @@ Supported Hardware Microarchitecture Compatibility: A100 40GB, A100 80GB, H100 8
185
  Supported Operating System(s): Linux
186
 
187
  ## Model Version(s)
188
- **llama-NemoRetriever-colembed-1b-v1**
189
 
190
  # Training and Evaluation Datasets
191
 
@@ -206,6 +207,17 @@ We evaluate the model on multiple benchmarks for Visual Document Retrieval, ViDo
206
  - **Labeling Method by dataset:** Hybrid: Automated, Human, Synthetic
207
  - **Properties:** More details on ViDoRe V1 and ViDoRe V2 can be found on their leaderboard. [Visual Document Retrieval Benchmark](https://huggingface.co/vidore), ViDoRe, is composed of various page-level retrieving tasks spanning multiple domains, languages, and settings.
208
 
209
  ## Inference:
210
  **Acceleration Engine:** Not Applicable <br>
211
  **Test Hardware:** A100 40GB, A100 80GB, H100 80GB
 
27
  This model is for non-commercial/research use only.
28
 
29
  ### License/Terms of Use
30
+ Governing Terms for llama-nemoretriever-colembed-1b-v1 model: [NVIDIA Non-Commercial License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/LICENSE)
31
  Additional Information: [Apache License 2.0](https://choosealicense.com/licenses/apache-2.0/) for [siglip2-giant-opt-patch16-384](https://huggingface.co/google/siglip2-giant-opt-patch16-384); and [LLAMA 3.2 Community License Agreement](https://huggingface.co/meta-llama/Llama-3.2-1B/blob/main/LICENSE.txt) for [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B). Built with Meta Llama 3. Improved using Qwen.
32
 
33
  This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
 
37
  - Gabriel Moreira
38
  - Radek Osmulski
39
  - Ronay Ak
40
+ - Yauhen Babakhin
41
  - Even Oldridge
42
  - Benedikt Schifferer
43
 
 
120
  trust_remote_code=True,
121
  torch_dtype=torch.bfloat16,
122
  attn_implementation="flash_attention_2",
123
+ revision='1f0fdea7f5b19532a750be109b19072d719b8177'
124
  ).eval()
125
 
126
  # Queries
 
166
  pip install git+https://github.com/illuin-tech/vidore-benchmark@e0eb9032e7e00adc8aa6f9cb35d5a9371f67485a
167
  # Downgrade transformers as vidore will install latest transformers
168
  pip install transformers==4.49.0
169
+ CUDA_VISIBLE_DEVICES=0; python3 vidore_eval.py --model_name_or_path nvidia/llama-nemoretriever-colembed-1b-v1 --savedir_datasets ./results/ --model_revision 1f0fdea7f5b19532a750be109b19072d719b8177
170
  ```
171
 
172
  The HuggingFace model artifact contains a [script](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/mteb_eval.py) to evaluate MTEB VisualDocumentRetrieval. We install ViDoRe benchmark to capture dependencies, first.
 
186
  Supported Operating System(s): Linux
187
 
188
  ## Model Version(s)
189
+ **llama-nemoretriever-colembed-1b-v1**
190
 
191
  # Training and Evaluation Datasets
192
 
 
207
  - **Labeling Method by dataset:** Hybrid: Automated, Human, Synthetic
208
  - **Properties:** More details on ViDoRe V1 and ViDoRe V2 can be found on their leaderboard. [Visual Document Retrieval Benchmark](https://huggingface.co/vidore), ViDoRe, is composed of various page-level retrieving tasks spanning multiple domains, languages, and settings.
209
 
210
+ | **Benchmark** | **Model 1B** | **Model 3B** |
211
+ |--------------------------------|--------------|--------------|
212
+ | ViDoRe V1 (06/27/2025) | 0.9050 | 0.9100 |
213
+ | ViDoRe V1 (deprecated) | 0.9049 | 0.9098 |
214
+ | ViDoRe V2 (06/27/2025) | 0.6209 | 0.6352 |
215
+ | ViDoRe V2 (deprecated) | 0.6261 | 0.6342 |
216
+ | MTEB Visual Document Retrieval | 0.8238 | 0.8315 |
217
+
218
+ Note: All scores are Avg. NDCG@5. ViDoRe V1 and V2 were updated on June 27th, 2025 to use the calculated scores from [MTEB](https://github.com/embeddings-benchmark/mteb), which can result in slightly different scores. The ViDoRe V2 (06/27/2025) uses only 4 of the original 7 datasets.
219
+
220
+
221
  ## Inference:
222
  **Acceleration Engine:** Not Applicable <br>
223
  **Test Hardware:** A100 40GB, A100 80GB, H100 80GB
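
The note in the README hunk says all reported scores are average NDCG@5. As a rough sketch of how such an average could be derived from a file shaped like the `results.json` added in this commit (a `metrics` mapping from dataset name to per-metric scores), the Python below averages `ndcg_at_5` over datasets. The helper name `average_metric` and the three-dataset sample are illustrative assumptions. This is not the official leaderboard computation, and the sample does not reproduce the table values because it covers only a subset of datasets.

```python
import json
from statistics import mean


def average_metric(results: dict, metric: str = "ndcg_at_5") -> float:
    """Average one metric over every dataset under results["metrics"].

    Hypothetical helper: datasets where the metric is missing or null are skipped.
    """
    scores = [
        dataset_scores[metric]
        for dataset_scores in results["metrics"].values()
        if dataset_scores.get(metric) is not None
    ]
    if not scores:
        raise ValueError(f"no dataset reports {metric!r}")
    return mean(scores)


# Three entries copied from results.json in this commit; the real file has more.
sample = json.loads("""
{
  "metrics": {
    "vidore/arxivqa_test_subsampled": {"ndcg_at_5": 0.87608},
    "vidore/docvqa_test_subsampled": {"ndcg_at_5": 0.6418},
    "vidore/infovqa_test_subsampled": {"ndcg_at_5": 0.9364}
  }
}
""")

avg_ndcg5 = average_metric(sample)
print(f"Avg. NDCG@5 over {len(sample['metrics'])} datasets: {avg_ndcg5:.4f}")
```

To run it against a full file, the `sample` dictionary could be replaced with `json.load(open("results.json"))`, assuming the file keeps the same top-level `metrics` layout.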
configuration_siglip.py CHANGED
@@ -1,7 +1,25 @@
1
  # --------------------------------------------------------
2
  # Copyright (c) 2025 NVIDIA
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
  """ Siglip model configuration"""
7
 
 
1
+ # coding=utf-8
2
+
3
+ # Copyright 2024 The HuggingFace Inc. team. All rights reserved.
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License");
6
+ # you may not use this file except in compliance with the License.
7
+ # You may obtain a copy of the License at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # Unless required by applicable law or agreed to in writing, software
12
+ # distributed under the License is distributed on an "AS IS" BASIS,
13
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14
+ # See the License for the specific language governing permissions and
15
+
16
  # --------------------------------------------------------
17
  # Copyright (c) 2025 NVIDIA
18
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
19
  # --------------------------------------------------------
20
+ # Not a contribution
21
+ # Changes made by NVIDIA CORPORATION & AFFILIATES enabling llama-nemoretriever-colembed models or otherwise documented as
22
+ # NSCLv1 are not a contribution and subject to the terms and conditions in LICENSE.md
23
 
24
  """ Siglip model configuration"""
25
 
flash_attention.py CHANGED
@@ -3,7 +3,9 @@
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
- # https://github.com/Dao-AILab/flash-attention/blob/v0.2.8/flash_attn/flash_attention.py
 
 
7
  import torch
8
  import torch.nn as nn
9
  from einops import rearrange
 
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
+ # Based on https://github.com/Dao-AILab/flash-attention/blob/v0.2.8/flash_attn/flash_attention.py
7
+ # https://github.com/Dao-AILab/flash-attention/blob/main/LICENSE
8
+
9
  import torch
10
  import torch.nn as nn
11
  from einops import rearrange
modeling_llama_nemoretrievercolembed.py CHANGED
@@ -3,7 +3,11 @@
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
 
 
 
6
  # Importing torch before transformers can cause `segmentation fault`
 
7
  from transformers import AutoTokenizer, AutoConfig
8
  from transformers.modeling_outputs import SequenceClassifierOutputWithPast
9
  import base64
 
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
 
6
+ # Based on https://github.com/OpenGVLab/InternVL/blob/main/streamlit_demo/model_worker.py
7
+ # https://github.com/OpenGVLab/InternVL/?tab=MIT-1-ov-file#readme
8
+
9
  # Importing torch before transformers can cause `segmentation fault`
10
+
11
  from transformers import AutoTokenizer, AutoConfig
12
  from transformers.modeling_outputs import SequenceClassifierOutputWithPast
13
  import base64
modeling_siglip.py CHANGED
@@ -1,7 +1,27 @@
1
  # --------------------------------------------------------
2
  # Copyright (c) 2025 NVIDIA
3
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
4
  # --------------------------------------------------------
5
  """ PyTorch Siglip model."""
6
 
7
 
 
1
+ # coding=utf-8
2
+ # Copyright 2024 The HuggingFace Inc. team. All rights reserved.
3
+ #
4
+ # Licensed under the Apache License, Version 2.0 (the "License");
5
+ # you may not use this file except in compliance with the License.
6
+ # You may obtain a copy of the License at
7
+ #
8
+ # http://www.apache.org/licenses/LICENSE-2.0
9
+ #
10
+ # Unless required by applicable law or agreed to in writing, software
11
+ # distributed under the License is distributed on an "AS IS" BASIS,
12
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13
+ # See the License for the specific language governing permissions and
14
+
15
+
16
  # --------------------------------------------------------
17
  # Copyright (c) 2025 NVIDIA
18
  # Licensed under customized NSCLv1 [see LICENSE.md for details]
19
  # --------------------------------------------------------
20
+ # Not a contribution
21
+ # Changes made by NVIDIA CORPORATION & AFFILIATES enabling llama-nemoretriever-colembed models or otherwise documented as
22
+ # NSCLv1 are not a contribution and subject to the terms and conditions in LICENSE.md
23
+
24
+
25
  """ PyTorch Siglip model."""
26
 
27
 
results.json ADDED
@@ -0,0 +1,994 @@
1
+ {
2
+ "metadata": {
3
+ "timestamp": "2025-06-26T06:05:28.223089",
4
+ "vidore_benchmark_version": "5.0.1.dev12+ge0eb903"
5
+ },
6
+ "metrics": {
7
+ "vidore/arxivqa_test_subsampled": {
8
+ "ndcg_at_1": 0.816,
9
+ "ndcg_at_3": 0.86359,
10
+ "ndcg_at_5": 0.87608,
11
+ "ndcg_at_10": 0.88392,
12
+ "ndcg_at_20": 0.8879,
13
+ "ndcg_at_50": 0.89151,
14
+ "ndcg_at_100": 0.89314,
15
+ "map_at_1": 0.816,
16
+ "map_at_3": 0.85233,
17
+ "map_at_5": 0.85933,
18
+ "map_at_10": 0.86262,
19
+ "map_at_20": 0.86367,
20
+ "map_at_50": 0.86427,
21
+ "map_at_100": 0.86442,
22
+ "recall_at_1": 0.816,
23
+ "recall_at_3": 0.896,
24
+ "recall_at_5": 0.926,
25
+ "recall_at_10": 0.95,
26
+ "recall_at_20": 0.966,
27
+ "recall_at_50": 0.984,
28
+ "recall_at_100": 0.994,
29
+ "precision_at_1": 0.816,
30
+ "precision_at_3": 0.29867,
31
+ "precision_at_5": 0.1852,
32
+ "precision_at_10": 0.095,
33
+ "precision_at_20": 0.0483,
34
+ "precision_at_50": 0.01968,
35
+ "precision_at_100": 0.00994,
36
+ "mrr_at_1": 0.816,
37
+ "mrr_at_3": 0.8523333333333332,
38
+ "mrr_at_5": 0.8593333333333332,
39
+ "mrr_at_10": 0.8626190476190474,
40
+ "mrr_at_20": 0.8636749323775637,
41
+ "mrr_at_50": 0.8642709526380625,
42
+ "mrr_at_100": 0.8644171478708726,
43
+ "naucs_at_1_max": 0.6293010404524454,
44
+ "naucs_at_1_std": 0.4072658981997148,
45
+ "naucs_at_1_diff1": 0.9408539481068943,
46
+ "naucs_at_3_max": 0.7068956182776297,
47
+ "naucs_at_3_std": 0.4853641810438182,
48
+ "naucs_at_3_diff1": 0.919151086407275,
49
+ "naucs_at_5_max": 0.6675372851843446,
50
+ "naucs_at_5_std": 0.48681454563807564,
51
+ "naucs_at_5_diff1": 0.9050521109344639,
52
+ "naucs_at_10_max": 0.671241830065359,
53
+ "naucs_at_10_std": 0.5544724556489257,
54
+ "naucs_at_10_diff1": 0.9300653594771261,
55
+ "naucs_at_20_max": 0.6462624265392473,
56
+ "naucs_at_20_std": 0.5665952655572045,
57
+ "naucs_at_20_diff1": 0.9288735101883935,
58
+ "naucs_at_50_max": 0.6601890756302559,
59
+ "naucs_at_50_std": 0.7135270774976739,
60
+ "naucs_at_50_diff1": 0.9509803921568729,
61
+ "naucs_at_100_max": 0.4788359788359729,
62
+ "naucs_at_100_std": 0.7860255213196177,
63
+ "naucs_at_100_diff1": 0.9128540305010608
64
+ },
65
+ "vidore/docvqa_test_subsampled": {
66
+ "ndcg_at_1": 0.55432,
67
+ "ndcg_at_3": 0.62271,
68
+ "ndcg_at_5": 0.6418,
69
+ "ndcg_at_10": 0.65981,
70
+ "ndcg_at_20": 0.67245,
71
+ "ndcg_at_50": 0.68717,
72
+ "ndcg_at_100": 0.69371,
73
+ "map_at_1": 0.55432,
74
+ "map_at_3": 0.60643,
75
+ "map_at_5": 0.61696,
76
+ "map_at_10": 0.62444,
77
+ "map_at_20": 0.62807,
78
+ "map_at_50": 0.63053,
79
+ "map_at_100": 0.63112,
80
+ "recall_at_1": 0.55432,
81
+ "recall_at_3": 0.66962,
82
+ "recall_at_5": 0.71619,
83
+ "recall_at_10": 0.77162,
84
+ "recall_at_20": 0.8204,
85
+ "recall_at_50": 0.89357,
86
+ "recall_at_100": 0.93348,
87
+ "precision_at_1": 0.55432,
88
+ "precision_at_3": 0.22321,
89
+ "precision_at_5": 0.14324,
90
+ "precision_at_10": 0.07716,
91
+ "precision_at_20": 0.04102,
92
+ "precision_at_50": 0.01787,
93
+ "precision_at_100": 0.00933,
94
+ "mrr_at_1": 0.5543237250554324,
95
+ "mrr_at_3": 0.6064301552106431,
96
+ "mrr_at_5": 0.6169623059866962,
97
+ "mrr_at_10": 0.6244403970013725,
98
+ "mrr_at_20": 0.6280746690046116,
99
+ "mrr_at_50": 0.6305269355853262,
100
+ "mrr_at_100": 0.6311233105469878,
101
+ "naucs_at_1_max": 0.26918057100652915,
102
+ "naucs_at_1_std": 0.36012926047126925,
103
+ "naucs_at_1_diff1": 0.8917667066190673,
104
+ "naucs_at_3_max": 0.1874727550722851,
105
+ "naucs_at_3_std": 0.3626474506990236,
106
+ "naucs_at_3_diff1": 0.8344987396644553,
107
+ "naucs_at_5_max": 0.15678371580205439,
108
+ "naucs_at_5_std": 0.3744247784871604,
109
+ "naucs_at_5_diff1": 0.8157913006713307,
110
+ "naucs_at_10_max": 0.12657768477167816,
111
+ "naucs_at_10_std": 0.39528373597721594,
112
+ "naucs_at_10_diff1": 0.8084378962022776,
113
+ "naucs_at_20_max": 0.07727578186474303,
114
+ "naucs_at_20_std": 0.4426250569829898,
115
+ "naucs_at_20_diff1": 0.7647475248422911,
116
+ "naucs_at_50_max": 0.05408134294887712,
117
+ "naucs_at_50_std": 0.6746820186486754,
118
+ "naucs_at_50_diff1": 0.731714700634661,
119
+ "naucs_at_100_max": 0.13603326841324073,
120
+ "naucs_at_100_std": 0.8784558171048236,
121
+ "naucs_at_100_diff1": 0.6908191020466624
122
+ },
123
+ "vidore/infovqa_test_subsampled": {
124
+ "ndcg_at_1": 0.91093,
125
+ "ndcg_at_3": 0.92978,
126
+ "ndcg_at_5": 0.9364,
127
+ "ndcg_at_10": 0.94263,
128
+ "ndcg_at_20": 0.94514,
129
+ "ndcg_at_50": 0.94634,
130
+ "ndcg_at_100": 0.94739,
131
+ "map_at_1": 0.91093,
132
+ "map_at_3": 0.9251,
133
+ "map_at_5": 0.92874,
134
+ "map_at_10": 0.93152,
135
+ "map_at_20": 0.93218,
136
+ "map_at_50": 0.93237,
137
+ "map_at_100": 0.93248,
138
+ "recall_at_1": 0.91093,
139
+ "recall_at_3": 0.94332,
140
+ "recall_at_5": 0.95951,
141
+ "recall_at_10": 0.97773,
142
+ "recall_at_20": 0.98785,
143
+ "recall_at_50": 0.99393,
144
+ "recall_at_100": 1.0,
145
+ "precision_at_1": 0.91093,
146
+ "precision_at_3": 0.31444,
147
+ "precision_at_5": 0.1919,
148
+ "precision_at_10": 0.09777,
149
+ "precision_at_20": 0.04939,
150
+ "precision_at_50": 0.01988,
151
+ "precision_at_100": 0.01,
152
+ "mrr_at_1": 0.9109311740890689,
153
+ "mrr_at_3": 0.9251012145748988,
154
+ "mrr_at_5": 0.9287449392712549,
155
+ "mrr_at_10": 0.9315162907268171,
156
+ "mrr_at_20": 0.9321792672411868,
157
+ "mrr_at_50": 0.9323702937480027,
158
+ "mrr_at_100": 0.9324824899427555,
159
+ "naucs_at_1_max": 0.5050488059362528,
160
+ "naucs_at_1_std": 0.24635244116889507,
161
+ "naucs_at_1_diff1": 0.9580692684878348,
162
+ "naucs_at_3_max": 0.4711422227428897,
163
+ "naucs_at_3_std": 0.23770030547234597,
164
+ "naucs_at_3_diff1": 0.9626852988386382,
165
+ "naucs_at_5_max": 0.6927616323055464,
166
+ "naucs_at_5_std": 0.45649645323646826,
167
+ "naucs_at_5_diff1": 0.9673496364838072,
168
+ "naucs_at_10_max": 0.7590524361659158,
169
+ "naucs_at_10_std": 0.529529634785233,
170
+ "naucs_at_10_diff1": 0.9762542810791351,
171
+ "naucs_at_20_max": 0.7830887900175995,
172
+ "naucs_at_20_std": 0.7611980314414127,
173
+ "naucs_at_20_diff1": 0.9782330909891938,
174
+ "naucs_at_50_max": 0.9564661819784259,
175
+ "naucs_at_50_std": 0.9074217540806789,
176
+ "naucs_at_50_diff1": 1.0,
177
+ "naucs_at_100_max": null,
178
+ "naucs_at_100_std": null,
179
+ "naucs_at_100_diff1": null
180
+ },
181
+ "vidore/tabfquad_test_subsampled": {
182
+ "ndcg_at_1": 0.88929,
183
+ "ndcg_at_3": 0.93699,
184
+ "ndcg_at_5": 0.94298,
185
+ "ndcg_at_10": 0.94649,
186
+ "ndcg_at_20": 0.94746,
187
+ "ndcg_at_50": 0.94746,
188
+ "ndcg_at_100": 0.94808,
189
+ "map_at_1": 0.88929,
190
+ "map_at_3": 0.92619,
191
+ "map_at_5": 0.92958,
192
+ "map_at_10": 0.93105,
193
+ "map_at_20": 0.93135,
194
+ "map_at_50": 0.93135,
195
+ "map_at_100": 0.93142,
196
+ "recall_at_1": 0.88929,
197
+ "recall_at_3": 0.96786,
198
+ "recall_at_5": 0.98214,
199
+ "recall_at_10": 0.99286,
200
+ "recall_at_20": 0.99643,
201
+ "recall_at_50": 0.99643,
202
+ "recall_at_100": 1.0,
203
+ "precision_at_1": 0.88929,
204
+ "precision_at_3": 0.32262,
205
+ "precision_at_5": 0.19643,
206
+ "precision_at_10": 0.09929,
207
+ "precision_at_20": 0.04982,
208
+ "precision_at_50": 0.01993,
209
+ "precision_at_100": 0.01,
210
+ "mrr_at_1": 0.8892857142857142,
211
+ "mrr_at_3": 0.9261904761904763,
212
+ "mrr_at_5": 0.9295833333333332,
213
+ "mrr_at_10": 0.9310501700680273,
214
+ "mrr_at_20": 0.9313477891156464,
215
+ "mrr_at_50": 0.9313477891156464,
216
+ "mrr_at_100": 0.9314164704343276,
217
+ "naucs_at_1_max": 0.11535645648394,
218
+ "naucs_at_1_std": 0.08977761508190756,
219
+ "naucs_at_1_diff1": 0.9270919129983833,
220
+ "naucs_at_3_max": 0.6139122315592889,
221
+ "naucs_at_3_std": 0.5563336445689361,
222
+ "naucs_at_3_diff1": 0.9564270152505452,
223
+ "naucs_at_5_max": 0.5654528478057843,
224
+ "naucs_at_5_std": 0.6578898225957153,
225
+ "naucs_at_5_diff1": 0.9738562091503306,
226
+ "naucs_at_10_max": 0.8611111111111035,
227
+ "naucs_at_10_std": 0.9346405228758269,
228
+ "naucs_at_10_diff1": 1.0,
229
+ "naucs_at_20_max": 1.0,
230
+ "naucs_at_20_std": 1.0,
231
+ "naucs_at_20_diff1": 1.0,
232
+ "naucs_at_50_max": 1.0,
233
+ "naucs_at_50_std": 1.0,
234
+ "naucs_at_50_diff1": 1.0,
235
+ "naucs_at_100_max": 1.0,
236
+ "naucs_at_100_std": 1.0,
237
+ "naucs_at_100_diff1": 1.0
238
+ },
239
+ "vidore/tatdqa_test": {
240
+ "ndcg_at_1": 0.69684,
241
+ "ndcg_at_3": 0.78254,
242
+ "ndcg_at_5": 0.79903,
243
+ "ndcg_at_10": 0.81433,
244
+ "ndcg_at_20": 0.82161,
245
+ "ndcg_at_50": 0.82553,
246
+ "ndcg_at_100": 0.82719,
247
+ "map_at_1": 0.69684,
248
+ "map_at_3": 0.76154,
249
+ "map_at_5": 0.77069,
250
+ "map_at_10": 0.77709,
251
+ "map_at_20": 0.7792,
252
+ "map_at_50": 0.77981,
253
+ "map_at_100": 0.77995,
254
+ "recall_at_1": 0.69684,
255
+ "recall_at_3": 0.84326,
256
+ "recall_at_5": 0.88335,
257
+ "recall_at_10": 0.93013,
258
+ "recall_at_20": 0.95808,
259
+ "recall_at_50": 0.97813,
260
+ "recall_at_100": 0.98846,
261
+ "precision_at_1": 0.69684,
262
+ "precision_at_3": 0.28109,
263
+ "precision_at_5": 0.17667,
264
+ "precision_at_10": 0.09301,
265
+ "precision_at_20": 0.0479,
266
+ "precision_at_50": 0.01956,
267
+ "precision_at_100": 0.00988,
268
+ "mrr_at_1": 0.695625759416768,
269
+ "mrr_at_3": 0.7609356014580799,
270
+ "mrr_at_5": 0.7699574726609966,
271
+ "mrr_at_10": 0.7763932284132767,
272
+ "mrr_at_20": 0.7785060539451517,
273
+ "mrr_at_50": 0.7791113338667969,
274
+ "mrr_at_100": 0.7792520714241775,
275
+ "naucs_at_1_max": 0.23477193022892978,
276
+ "naucs_at_1_std": 0.17073254873506194,
277
+ "naucs_at_1_diff1": 0.845390421248477,
278
+ "naucs_at_3_max": 0.21897649044161785,
279
+ "naucs_at_3_std": 0.23263205468737608,
280
+ "naucs_at_3_diff1": 0.7773076885537255,
281
+ "naucs_at_5_max": 0.22908169584132324,
282
+ "naucs_at_5_std": 0.25174255666871304,
283
+ "naucs_at_5_diff1": 0.7439197717296311,
284
+ "naucs_at_10_max": 0.23769949780387462,
285
+ "naucs_at_10_std": 0.29069564332383074,
286
+ "naucs_at_10_diff1": 0.7167658978463287,
287
+ "naucs_at_20_max": 0.21326491789377966,
288
+ "naucs_at_20_std": 0.26558383716997797,
289
+ "naucs_at_20_diff1": 0.6754815504474063,
290
+ "naucs_at_50_max": 0.12767303786085316,
291
+ "naucs_at_50_std": 0.24485452538521443,
292
+ "naucs_at_50_diff1": 0.6732815103835509,
293
+ "naucs_at_100_max": 0.14158104162792087,
294
+ "naucs_at_100_std": 0.324957585420732,
295
+ "naucs_at_100_diff1": 0.6752814674223621
296
+ },
297
+ "vidore/shiftproject_test": {
298
+ "ndcg_at_1": 0.84,
299
+ "ndcg_at_3": 0.9194,
300
+ "ndcg_at_5": 0.92327,
301
+ "ndcg_at_10": 0.9266,
302
+ "ndcg_at_20": 0.9266,
303
+ "ndcg_at_50": 0.9266,
304
+ "ndcg_at_100": 0.92821,
305
+ "map_at_1": 0.84,
306
+ "map_at_3": 0.90167,
307
+ "map_at_5": 0.90367,
308
+ "map_at_10": 0.9051,
309
+ "map_at_20": 0.9051,
310
+ "map_at_50": 0.9051,
311
+ "map_at_100": 0.90523,
312
+ "recall_at_1": 0.84,
313
+ "recall_at_3": 0.97,
314
+ "recall_at_5": 0.98,
315
+ "recall_at_10": 0.99,
316
+ "recall_at_20": 0.99,
317
+ "recall_at_50": 0.99,
318
+ "recall_at_100": 1.0,
319
+ "precision_at_1": 0.84,
320
+ "precision_at_3": 0.32333,
321
+ "precision_at_5": 0.196,
322
+ "precision_at_10": 0.099,
323
+ "precision_at_20": 0.0495,
324
+ "precision_at_50": 0.0198,
325
+ "precision_at_100": 0.01,
326
+ "mrr_at_1": 0.84,
327
+ "mrr_at_3": 0.9016666666666666,
328
+ "mrr_at_5": 0.9036666666666667,
329
+ "mrr_at_10": 0.9050952380952382,
330
+ "mrr_at_20": 0.9050952380952382,
331
+ "mrr_at_50": 0.9050952380952382,
332
+ "mrr_at_100": 0.905232224396608,
333
+ "naucs_at_1_max": 0.19096710849288218,
334
+ "naucs_at_1_std": -0.30166912125674944,
335
+ "naucs_at_1_diff1": 0.8265218458517417,
336
+ "naucs_at_3_max": -0.20401493930905265,
337
+ "naucs_at_3_std": -0.7268907563025196,
338
+ "naucs_at_3_diff1": 0.9564270152505466,
339
+ "naucs_at_5_max": -0.3674136321195164,
340
+ "naucs_at_5_std": -0.5144724556489195,
341
+ "naucs_at_5_diff1": 0.9346405228758136,
342
+ "naucs_at_10_max": -0.1713352007469681,
343
+ "naucs_at_10_std": 0.12278244631185926,
344
+ "naucs_at_10_diff1": 1.0,
345
+ "naucs_at_20_max": -0.1713352007469681,
346
+ "naucs_at_20_std": 0.12278244631185926,
347
+ "naucs_at_20_diff1": 1.0,
348
+ "naucs_at_50_max": -0.17133520074697067,
349
+ "naucs_at_50_std": 0.12278244631185525,
350
+ "naucs_at_50_diff1": 1.0,
351
+ "naucs_at_100_max": null,
352
+ "naucs_at_100_std": null,
353
+ "naucs_at_100_diff1": null
354
+ },
355
+ "vidore/syntheticDocQA_artificial_intelligence_test": {
356
+ "ndcg_at_1": 1.0,
357
+ "ndcg_at_3": 1.0,
358
+ "ndcg_at_5": 1.0,
359
+ "ndcg_at_10": 1.0,
360
+ "ndcg_at_20": 1.0,
361
+ "ndcg_at_50": 1.0,
362
+ "ndcg_at_100": 1.0,
363
+ "map_at_1": 1.0,
364
+ "map_at_3": 1.0,
365
+ "map_at_5": 1.0,
366
+ "map_at_10": 1.0,
367
+ "map_at_20": 1.0,
368
+ "map_at_50": 1.0,
369
+ "map_at_100": 1.0,
370
+ "recall_at_1": 1.0,
371
+ "recall_at_3": 1.0,
372
+ "recall_at_5": 1.0,
373
+ "recall_at_10": 1.0,
374
+ "recall_at_20": 1.0,
375
+ "recall_at_50": 1.0,
376
+ "recall_at_100": 1.0,
377
+ "precision_at_1": 1.0,
378
+ "precision_at_3": 0.33333,
379
+ "precision_at_5": 0.2,
380
+ "precision_at_10": 0.1,
381
+ "precision_at_20": 0.05,
382
+ "precision_at_50": 0.02,
383
+ "precision_at_100": 0.01,
384
+ "mrr_at_1": 1.0,
385
+ "mrr_at_3": 1.0,
386
+ "mrr_at_5": 1.0,
387
+ "mrr_at_10": 1.0,
388
+ "mrr_at_20": 1.0,
389
+ "mrr_at_50": 1.0,
390
+ "mrr_at_100": 1.0,
391
+ "naucs_at_1_max": null,
392
+ "naucs_at_1_std": null,
393
+ "naucs_at_1_diff1": null,
394
+ "naucs_at_3_max": 1.0,
395
+ "naucs_at_3_std": 1.0,
396
+ "naucs_at_3_diff1": 1.0,
397
+ "naucs_at_5_max": 1.0,
398
+ "naucs_at_5_std": 1.0,
399
+ "naucs_at_5_diff1": 1.0,
400
+ "naucs_at_10_max": 1.0,
401
+ "naucs_at_10_std": 1.0,
402
+ "naucs_at_10_diff1": 1.0,
403
+ "naucs_at_20_max": 1.0,
404
+ "naucs_at_20_std": 1.0,
405
+ "naucs_at_20_diff1": 1.0,
406
+ "naucs_at_50_max": null,
407
+ "naucs_at_50_std": null,
408
+ "naucs_at_50_diff1": null,
409
+ "naucs_at_100_max": null,
410
+ "naucs_at_100_std": null,
411
+ "naucs_at_100_diff1": null
412
+ },
413
+ "vidore/syntheticDocQA_energy_test": {
414
+ "ndcg_at_1": 0.96,
415
+ "ndcg_at_3": 0.96631,
416
+ "ndcg_at_5": 0.96631,
417
+ "ndcg_at_10": 0.97235,
418
+ "ndcg_at_20": 0.97235,
419
+ "ndcg_at_50": 0.97451,
420
+ "ndcg_at_100": 0.97451,
421
+ "map_at_1": 0.96,
422
+ "map_at_3": 0.965,
423
+ "map_at_5": 0.965,
424
+ "map_at_10": 0.96725,
425
+ "map_at_20": 0.96725,
426
+ "map_at_50": 0.96767,
427
+ "map_at_100": 0.96767,
428
+ "recall_at_1": 0.96,
429
+ "recall_at_3": 0.97,
430
+ "recall_at_5": 0.97,
431
+ "recall_at_10": 0.99,
432
+ "recall_at_20": 0.99,
433
+ "recall_at_50": 1.0,
434
+ "recall_at_100": 1.0,
435
+ "precision_at_1": 0.96,
436
+ "precision_at_3": 0.32333,
437
+ "precision_at_5": 0.194,
438
+ "precision_at_10": 0.099,
439
+ "precision_at_20": 0.0495,
440
+ "precision_at_50": 0.02,
441
+ "precision_at_100": 0.01,
442
+ "mrr_at_1": 0.96,
443
+ "mrr_at_3": 0.965,
444
+ "mrr_at_5": 0.965,
445
+ "mrr_at_10": 0.9672499999999999,
446
+ "mrr_at_20": 0.9672499999999999,
447
+ "mrr_at_50": 0.9676666666666667,
448
+ "mrr_at_100": 0.9676666666666667,
449
+ "naucs_at_1_max": 0.5671101774042947,
450
+ "naucs_at_1_std": -0.5088702147525661,
451
+ "naucs_at_1_diff1": 1.0,
452
+ "naucs_at_3_max": 0.7152194211017727,
453
+ "naucs_at_3_std": -0.09850606909430029,
454
+ "naucs_at_3_diff1": 1.0,
455
+ "naucs_at_5_max": 0.7152194211017747,
456
+ "naucs_at_5_std": -0.09850606909430323,
457
+ "naucs_at_5_diff1": 1.0,
458
+ "naucs_at_10_max": 0.8692810457516413,
459
+ "naucs_at_10_std": 0.7222222222222276,
460
+ "naucs_at_10_diff1": 1.0,
461
+ "naucs_at_20_max": 0.8692810457516413,
462
+ "naucs_at_20_std": 0.7222222222222276,
463
+ "naucs_at_20_diff1": 1.0,
464
+ "naucs_at_50_max": null,
465
+ "naucs_at_50_std": null,
466
+ "naucs_at_50_diff1": null,
467
+ "naucs_at_100_max": null,
468
+ "naucs_at_100_std": null,
469
+ "naucs_at_100_diff1": null
470
+ },
471
+ "vidore/syntheticDocQA_government_reports_test": {
472
+ "ndcg_at_1": 0.93,
473
+ "ndcg_at_3": 0.96655,
474
+ "ndcg_at_5": 0.96655,
475
+ "ndcg_at_10": 0.97011,
476
+ "ndcg_at_20": 0.97011,
477
+ "ndcg_at_50": 0.97011,
478
+ "ndcg_at_100": 0.97011,
479
+ "map_at_1": 0.93,
480
+ "map_at_3": 0.95833,
481
+ "map_at_5": 0.95833,
482
+ "map_at_10": 0.96,
483
+ "map_at_20": 0.96,
484
+ "map_at_50": 0.96,
485
+ "map_at_100": 0.96,
486
+ "recall_at_1": 0.93,
487
+ "recall_at_3": 0.99,
488
+ "recall_at_5": 0.99,
489
+ "recall_at_10": 1.0,
490
+ "recall_at_20": 1.0,
491
+ "recall_at_50": 1.0,
492
+ "recall_at_100": 1.0,
493
+ "precision_at_1": 0.93,
494
+ "precision_at_3": 0.33,
495
+ "precision_at_5": 0.198,
496
+ "precision_at_10": 0.1,
497
+ "precision_at_20": 0.05,
498
+ "precision_at_50": 0.02,
499
+ "precision_at_100": 0.01,
500
+ "mrr_at_1": 0.93,
501
+ "mrr_at_3": 0.9583333333333335,
502
+ "mrr_at_5": 0.9583333333333335,
503
+ "mrr_at_10": 0.96,
504
+ "mrr_at_20": 0.96,
505
+ "mrr_at_50": 0.96,
506
+ "mrr_at_100": 0.96,
507
+ "naucs_at_1_max": 0.771308523409364,
508
+ "naucs_at_1_std": 0.25456849406429166,
509
+ "naucs_at_1_diff1": 0.943977591036415,
510
+ "naucs_at_3_max": 1.0,
511
+ "naucs_at_3_std": 1.0,
512
+ "naucs_at_3_diff1": 1.0,
513
+ "naucs_at_5_max": 1.0,
514
+ "naucs_at_5_std": 1.0,
515
+ "naucs_at_5_diff1": 1.0,
516
+ "naucs_at_10_max": 1.0,
517
+ "naucs_at_10_std": 1.0,
518
+ "naucs_at_10_diff1": 1.0,
519
+ "naucs_at_20_max": 1.0,
520
+ "naucs_at_20_std": 1.0,
521
+ "naucs_at_20_diff1": 1.0,
522
+ "naucs_at_50_max": null,
523
+ "naucs_at_50_std": null,
524
+ "naucs_at_50_diff1": null,
525
+ "naucs_at_100_max": null,
526
+ "naucs_at_100_std": null,
527
+ "naucs_at_100_diff1": null
528
+ },
529
+ "vidore/syntheticDocQA_healthcare_industry_test": {
530
+ "ndcg_at_1": 0.99,
531
+ "ndcg_at_3": 0.99631,
532
+ "ndcg_at_5": 0.99631,
533
+ "ndcg_at_10": 0.99631,
534
+ "ndcg_at_20": 0.99631,
535
+ "ndcg_at_50": 0.99631,
536
+ "ndcg_at_100": 0.99631,
537
+ "map_at_1": 0.99,
538
+ "map_at_3": 0.995,
539
+ "map_at_5": 0.995,
540
+ "map_at_10": 0.995,
541
+ "map_at_20": 0.995,
542
+ "map_at_50": 0.995,
543
+ "map_at_100": 0.995,
544
+ "recall_at_1": 0.99,
545
+ "recall_at_3": 1.0,
546
+ "recall_at_5": 1.0,
547
+ "recall_at_10": 1.0,
548
+ "recall_at_20": 1.0,
549
+ "recall_at_50": 1.0,
550
+ "recall_at_100": 1.0,
551
+ "precision_at_1": 0.99,
552
+ "precision_at_3": 0.33333,
553
+ "precision_at_5": 0.2,
554
+ "precision_at_10": 0.1,
555
+ "precision_at_20": 0.05,
556
+ "precision_at_50": 0.02,
557
+ "precision_at_100": 0.01,
558
+ "mrr_at_1": 0.99,
559
+ "mrr_at_3": 0.995,
560
+ "mrr_at_5": 0.995,
561
+ "mrr_at_10": 0.995,
562
+ "mrr_at_20": 0.995,
563
+ "mrr_at_50": 0.995,
564
+ "mrr_at_100": 0.995,
565
+ "naucs_at_1_max": 0.7222222222222201,
566
+ "naucs_at_1_std": 1.0,
567
+ "naucs_at_1_diff1": 1.0,
568
+ "naucs_at_3_max": 1.0,
569
+ "naucs_at_3_std": 1.0,
570
+ "naucs_at_3_diff1": 1.0,
571
+ "naucs_at_5_max": 1.0,
572
+ "naucs_at_5_std": 1.0,
573
+ "naucs_at_5_diff1": 1.0,
574
+ "naucs_at_10_max": 1.0,
575
+ "naucs_at_10_std": 1.0,
576
+ "naucs_at_10_diff1": 1.0,
577
+ "naucs_at_20_max": 1.0,
578
+ "naucs_at_20_std": 1.0,
579
+ "naucs_at_20_diff1": 1.0,
580
+ "naucs_at_50_max": null,
581
+ "naucs_at_50_std": null,
582
+ "naucs_at_50_diff1": null,
583
+ "naucs_at_100_max": null,
584
+ "naucs_at_100_std": null,
585
+ "naucs_at_100_diff1": null
586
+ },
587
+ "vidore/synthetic_rse_restaurant_filtered_v1.0_multilingual": {
588
+ "ndcg_at_1": 0.5,
589
+ "ndcg_at_3": 0.51063,
590
+ "ndcg_at_5": 0.56843,
591
+ "ndcg_at_10": 0.62312,
592
+ "ndcg_at_20": 0.65401,
593
+ "ndcg_at_50": 0.67834,
594
+ "ndcg_at_100": 0.6907,
595
+ "map_at_1": 0.26391,
596
+ "map_at_3": 0.38408,
597
+ "map_at_5": 0.45528,
598
+ "map_at_10": 0.5042,
599
+ "map_at_20": 0.52273,
600
+ "map_at_50": 0.53634,
601
+ "map_at_100": 0.54254,
602
+ "recall_at_1": 0.26391,
603
+ "recall_at_3": 0.46643,
604
+ "recall_at_5": 0.61986,
605
+ "recall_at_10": 0.78311,
606
+ "recall_at_20": 0.87897,
607
+ "recall_at_50": 0.94043,
608
+ "recall_at_100": 0.97277,
609
+ "precision_at_1": 0.5,
610
+ "precision_at_3": 0.35088,
611
+ "precision_at_5": 0.30526,
612
+ "precision_at_10": 0.20702,
613
+ "precision_at_20": 0.12719,
614
+ "precision_at_50": 0.06561,
615
+ "precision_at_100": 0.03789,
616
+ "mrr_at_1": 0.5,
617
+ "mrr_at_3": 0.6089181286549706,
618
+ "mrr_at_5": 0.6317251461988304,
619
+ "mrr_at_10": 0.641208925090504,
620
+ "mrr_at_20": 0.6448876756055548,
621
+ "mrr_at_50": 0.6448876756055548,
622
+ "mrr_at_100": 0.6450109692219045,
623
+ "naucs_at_1_max": 0.06266633655005126,
624
+ "naucs_at_1_std": 0.10328247420677374,
625
+ "naucs_at_1_diff1": 0.32364592144068777,
626
+ "naucs_at_3_max": -0.027204152815788592,
627
+ "naucs_at_3_std": 0.10728940302719199,
628
+ "naucs_at_3_diff1": 0.24109996679515233,
629
+ "naucs_at_5_max": -0.08818416172256802,
630
+ "naucs_at_5_std": 0.06933352726164131,
631
+ "naucs_at_5_diff1": 0.1346725342215723,
632
+ "naucs_at_10_max": -0.1634891544078435,
633
+ "naucs_at_10_std": -0.07449668399775349,
634
+ "naucs_at_10_diff1": 0.06840639837870822,
635
+ "naucs_at_20_max": -0.23984204721411798,
636
+ "naucs_at_20_std": -0.17420250142740507,
637
+ "naucs_at_20_diff1": -0.03237996351004239,
638
+ "naucs_at_50_max": -0.25657441933802366,
639
+ "naucs_at_50_std": -0.21481884062099896,
640
+ "naucs_at_50_diff1": -0.11144862322897976,
641
+ "naucs_at_100_max": -0.26954674492082575,
642
+ "naucs_at_100_std": -0.22275247242806823,
643
+ "naucs_at_100_diff1": -0.12549423492642411
644
+ },
645
+ "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered_multilingual": {
646
+ "ndcg_at_1": 0.61875,
647
+ "ndcg_at_3": 0.60842,
648
+ "ndcg_at_5": 0.62342,
649
+ "ndcg_at_10": 0.65234,
650
+ "ndcg_at_20": 0.67684,
651
+ "ndcg_at_50": 0.70181,
652
+ "ndcg_at_100": 0.71529,
653
+ "map_at_1": 0.37744,
654
+ "map_at_3": 0.49938,
655
+ "map_at_5": 0.53387,
656
+ "map_at_10": 0.56442,
657
+ "map_at_20": 0.57848,
658
+ "map_at_50": 0.58719,
659
+ "map_at_100": 0.59025,
660
+ "recall_at_1": 0.37744,
661
+ "recall_at_3": 0.56678,
662
+ "recall_at_5": 0.64475,
663
+ "recall_at_10": 0.73598,
664
+ "recall_at_20": 0.80296,
665
+ "recall_at_50": 0.88095,
666
+ "recall_at_100": 0.93199,
667
+ "precision_at_1": 0.61875,
668
+ "precision_at_3": 0.37188,
669
+ "precision_at_5": 0.27312,
670
+ "precision_at_10": 0.17187,
671
+ "precision_at_20": 0.10141,
672
+ "precision_at_50": 0.04884,
673
+ "precision_at_100": 0.02719,
674
+ "mrr_at_1": 0.61875,
675
+ "mrr_at_3": 0.699739583333333,
676
+ "mrr_at_5": 0.7123958333333327,
677
+ "mrr_at_10": 0.7202827380952376,
678
+ "mrr_at_20": 0.7221802283704852,
679
+ "mrr_at_50": 0.7230994356972297,
680
+ "mrr_at_100": 0.7232405617307813,
681
+ "naucs_at_1_max": 0.165285331951999,
682
+ "naucs_at_1_std": 0.034224006446228576,
683
+ "naucs_at_1_diff1": 0.4806948881022956,
684
+ "naucs_at_3_max": 0.024453257298345382,
685
+ "naucs_at_3_std": -0.04382987653963637,
686
+ "naucs_at_3_diff1": -0.031119586246158605,
687
+ "naucs_at_5_max": 0.0044058378732677785,
688
+ "naucs_at_5_std": -0.057741699699710346,
689
+ "naucs_at_5_diff1": -0.10405546973817163,
690
+ "naucs_at_10_max": -0.053355219862296764,
691
+ "naucs_at_10_std": -0.0790732524974529,
692
+ "naucs_at_10_diff1": -0.19731155024816296,
693
+ "naucs_at_20_max": -0.08441971461943433,
694
+ "naucs_at_20_std": -0.08182264243833959,
695
+ "naucs_at_20_diff1": -0.2493971279113114,
696
+ "naucs_at_50_max": -0.09688829418144233,
697
+ "naucs_at_50_std": -0.0658329150011907,
698
+ "naucs_at_50_diff1": -0.3027368345306483,
699
+ "naucs_at_100_max": -0.10823541191893869,
700
+ "naucs_at_100_std": -0.0945104052898891,
701
+ "naucs_at_100_diff1": -0.3316920486637138
702
+ },
703
+ "vidore/synthetics_economics_macro_economy_2024_filtered_v1.0_multilingual": {
704
+ "ndcg_at_1": 0.62931,
705
+ "ndcg_at_3": 0.59513,
706
+ "ndcg_at_5": 0.56377,
707
+ "ndcg_at_10": 0.56053,
708
+ "ndcg_at_20": 0.58335,
709
+ "ndcg_at_50": 0.65782,
710
+ "ndcg_at_100": 0.69264,
711
+ "map_at_1": 0.09011,
712
+ "map_at_3": 0.19282,
713
+ "map_at_5": 0.24036,
714
+ "map_at_10": 0.31499,
715
+ "map_at_20": 0.36938,
716
+ "map_at_50": 0.43088,
717
+ "map_at_100": 0.45631,
718
+ "recall_at_1": 0.09011,
719
+ "recall_at_3": 0.24556,
720
+ "recall_at_5": 0.31701,
721
+ "recall_at_10": 0.45534,
722
+ "recall_at_20": 0.58907,
723
+ "recall_at_50": 0.80132,
724
+ "recall_at_100": 0.90565,
725
+ "precision_at_1": 0.62931,
726
+ "precision_at_3": 0.54598,
727
+ "precision_at_5": 0.48276,
728
+ "precision_at_10": 0.40216,
729
+ "precision_at_20": 0.30345,
730
+ "precision_at_50": 0.19431,
731
+ "precision_at_100": 0.12522,
732
+ "mrr_at_1": 0.6293103448275862,
733
+ "mrr_at_3": 0.7471264367816095,
734
+ "mrr_at_5": 0.7579022988505749,
735
+ "mrr_at_10": 0.7618944991789821,
736
+ "mrr_at_20": 0.7634015268231467,
737
+ "mrr_at_50": 0.7639494630822898,
738
+ "mrr_at_100": 0.7639494630822898,
739
+ "naucs_at_1_max": 0.04993907976795083,
740
+ "naucs_at_1_std": 0.18482267813483494,
741
+ "naucs_at_1_diff1": 0.2858794410962534,
742
+ "naucs_at_3_max": -0.08514210036309953,
743
+ "naucs_at_3_std": 0.11808105150508759,
744
+ "naucs_at_3_diff1": 0.027094035130762372,
745
+ "naucs_at_5_max": -0.004367985895311936,
746
+ "naucs_at_5_std": 0.15698404005850908,
747
+ "naucs_at_5_diff1": -0.02650835347778693,
748
+ "naucs_at_10_max": -0.02264677824492615,
749
+ "naucs_at_10_std": 0.1395283970615093,
750
+ "naucs_at_10_diff1": -0.0636982012078006,
751
+ "naucs_at_20_max": -0.046111448743688874,
752
+ "naucs_at_20_std": 0.08818519563045128,
753
+ "naucs_at_20_diff1": -0.10045481147941789,
754
+ "naucs_at_50_max": -0.031747430051497286,
755
+ "naucs_at_50_std": 0.07071502869980983,
756
+ "naucs_at_50_diff1": -0.14000365109174984,
757
+ "naucs_at_100_max": -0.066374733897631,
758
+ "naucs_at_100_std": -0.013446247932639044,
759
+ "naucs_at_100_diff1": -0.16056962894086707
760
+ },
761
+ "vidore/restaurant_esg_reports_beir": {
762
+ "ndcg_at_1": 0.72436,
763
+ "ndcg_at_3": 0.75198,
764
+ "ndcg_at_5": 0.76869,
765
+ "ndcg_at_10": 0.79801,
766
+ "ndcg_at_20": 0.81428,
767
+ "ndcg_at_50": 0.82799,
768
+ "ndcg_at_100": 0.8295,
769
+ "map_at_1": 0.51909,
770
+ "map_at_3": 0.66566,
771
+ "map_at_5": 0.70632,
772
+ "map_at_10": 0.73195,
773
+ "map_at_20": 0.73946,
774
+ "map_at_50": 0.74654,
775
+ "map_at_100": 0.74697,
776
+ "recall_at_1": 0.51909,
777
+ "recall_at_3": 0.73545,
778
+ "recall_at_5": 0.80596,
779
+ "recall_at_10": 0.8816,
780
+ "recall_at_20": 0.92428,
781
+ "recall_at_50": 0.96885,
782
+ "recall_at_100": 0.97445,
783
+ "precision_at_1": 0.75,
784
+ "precision_at_3": 0.41026,
785
+ "precision_at_5": 0.3,
786
+ "precision_at_10": 0.17308,
787
+ "precision_at_20": 0.09712,
788
+ "precision_at_50": 0.04462,
789
+ "precision_at_100": 0.02269,
790
+ "mrr_at_1": 0.75,
791
+ "mrr_at_3": 0.8333333333333334,
792
+ "mrr_at_5": 0.8410256410256411,
793
+ "mrr_at_10": 0.8442307692307693,
794
+ "mrr_at_20": 0.845979020979021,
795
+ "mrr_at_50": 0.845979020979021,
796
+ "mrr_at_100": 0.845979020979021,
797
+ "naucs_at_1_max": 0.24468166513237982,
798
+ "naucs_at_1_std": 0.19805637553820268,
799
+ "naucs_at_1_diff1": 0.44490949154735066,
800
+ "naucs_at_3_max": 0.04409608181340947,
801
+ "naucs_at_3_std": 0.12191586031897299,
802
+ "naucs_at_3_diff1": -0.20170920044109023,
803
+ "naucs_at_5_max": -0.10046869346778849,
804
+ "naucs_at_5_std": -0.021466277530853978,
805
+ "naucs_at_5_diff1": -0.23631379826339902,
806
+ "naucs_at_10_max": -0.11031194559055651,
807
+ "naucs_at_10_std": -0.007349632141057487,
808
+ "naucs_at_10_diff1": -0.1954112809364078,
809
+ "naucs_at_20_max": -0.08664502573520873,
810
+ "naucs_at_20_std": 0.0007084915425306685,
811
+ "naucs_at_20_diff1": -0.3188566325096874,
812
+ "naucs_at_50_max": -0.13648898619435715,
813
+ "naucs_at_50_std": -0.04766895429454103,
814
+ "naucs_at_50_diff1": -0.3833929317006285,
815
+ "naucs_at_100_max": -0.15092442191928396,
816
+ "naucs_at_100_std": -0.0640639845215107,
817
+ "naucs_at_100_diff1": -0.3888345979263681
818
+ },
819
+ "vidore/synthetic_rse_restaurant_filtered_v1.0": {
820
+ "ndcg_at_1": 0.49123,
821
+ "ndcg_at_3": 0.51166,
822
+ "ndcg_at_5": 0.57093,
823
+ "ndcg_at_10": 0.62246,
824
+ "ndcg_at_20": 0.64573,
825
+ "ndcg_at_50": 0.67302,
826
+ "ndcg_at_100": 0.68379,
827
+ "map_at_1": 0.26696,
828
+ "map_at_3": 0.38476,
829
+ "map_at_5": 0.4576,
830
+ "map_at_10": 0.50379,
831
+ "map_at_20": 0.51794,
832
+ "map_at_50": 0.5326,
833
+ "map_at_100": 0.53858,
834
+ "recall_at_1": 0.26696,
835
+ "recall_at_3": 0.47153,
836
+ "recall_at_5": 0.64025,
837
+ "recall_at_10": 0.79041,
838
+ "recall_at_20": 0.86345,
839
+ "recall_at_50": 0.92982,
840
+ "recall_at_100": 0.95175,
841
+ "precision_at_1": 0.49123,
842
+ "precision_at_3": 0.35673,
843
+ "precision_at_5": 0.30526,
844
+ "precision_at_10": 0.20702,
845
+ "precision_at_20": 0.12193,
846
+ "precision_at_50": 0.06456,
847
+ "precision_at_100": 0.03737,
848
+ "mrr_at_1": 0.49122807017543857,
849
+ "mrr_at_3": 0.5964912280701753,
850
+ "mrr_at_5": 0.6298245614035087,
851
+ "mrr_at_10": 0.6337719298245614,
852
+ "mrr_at_20": 0.6367131062951495,
853
+ "mrr_at_50": 0.6367131062951495,
854
+ "mrr_at_100": 0.6367131062951495,
855
+ "naucs_at_1_max": 0.09118541685587986,
856
+ "naucs_at_1_std": -0.008192518221940852,
857
+ "naucs_at_1_diff1": 0.25538152453048424,
858
+ "naucs_at_3_max": -0.07082338888627057,
859
+ "naucs_at_3_std": 0.054108546626844174,
860
+ "naucs_at_3_diff1": 0.2145148016267093,
861
+ "naucs_at_5_max": -0.09990587583985085,
862
+ "naucs_at_5_std": 0.15901900396869725,
863
+ "naucs_at_5_diff1": 0.1903535434391372,
864
+ "naucs_at_10_max": -0.2542485342376898,
865
+ "naucs_at_10_std": -0.04845015739249387,
866
+ "naucs_at_10_diff1": 0.0742124569628327,
867
+ "naucs_at_20_max": -0.35959572979012006,
868
+ "naucs_at_20_std": -0.16398049550481905,
869
+ "naucs_at_20_diff1": -0.02970406796838026,
870
+ "naucs_at_50_max": -0.39280580078248756,
871
+ "naucs_at_50_std": -0.23497206052790093,
872
+ "naucs_at_50_diff1": -0.13254644231801654,
873
+ "naucs_at_100_max": -0.39460593340198014,
874
+ "naucs_at_100_std": -0.24027067078494477,
875
+ "naucs_at_100_diff1": -0.14366966729581043
876
+ },
877
+ "vidore/synthetic_economics_macro_economy_2024_filtered_v1.0": {
878
+ "ndcg_at_1": 0.74138,
879
+ "ndcg_at_3": 0.69889,
880
+ "ndcg_at_5": 0.64079,
881
+ "ndcg_at_10": 0.6271,
882
+ "ndcg_at_20": 0.63799,
883
+ "ndcg_at_50": 0.70863,
884
+ "ndcg_at_100": 0.74161,
885
+ "map_at_1": 0.11186,
886
+ "map_at_3": 0.23141,
887
+ "map_at_5": 0.28257,
888
+ "map_at_10": 0.3717,
889
+ "map_at_20": 0.42631,
890
+ "map_at_50": 0.49122,
891
+ "map_at_100": 0.51792,
892
+ "recall_at_1": 0.11186,
893
+ "recall_at_3": 0.27369,
894
+ "recall_at_5": 0.33748,
895
+ "recall_at_10": 0.48707,
896
+ "recall_at_20": 0.62067,
897
+ "recall_at_50": 0.8278,
898
+ "recall_at_100": 0.92727,
899
+ "precision_at_1": 0.74138,
900
+ "precision_at_3": 0.64943,
901
+ "precision_at_5": 0.54483,
902
+ "precision_at_10": 0.44655,
903
+ "precision_at_20": 0.32241,
904
+ "precision_at_50": 0.2031,
905
+ "precision_at_100": 0.13,
906
+ "mrr_at_1": 0.7413793103448276,
907
+ "mrr_at_3": 0.8189655172413793,
908
+ "mrr_at_5": 0.8224137931034482,
909
+ "mrr_at_10": 0.8252873563218389,
910
+ "mrr_at_20": 0.8293304396752672,
911
+ "mrr_at_50": 0.8293304396752672,
912
+ "mrr_at_100": 0.8293304396752672,
913
+ "naucs_at_1_max": 0.35838166878631555,
914
+ "naucs_at_1_std": 0.40425935610791647,
915
+ "naucs_at_1_diff1": 0.31196864342648645,
916
+ "naucs_at_3_max": -0.020321743603039347,
917
+ "naucs_at_3_std": 0.16664971950540372,
918
+ "naucs_at_3_diff1": -0.08724296354027215,
919
+ "naucs_at_5_max": -0.032106566274432695,
920
+ "naucs_at_5_std": 0.11470154885158665,
921
+ "naucs_at_5_diff1": -0.10263779775029133,
922
+ "naucs_at_10_max": -0.014430066686303686,
923
+ "naucs_at_10_std": 0.10566168354354433,
924
+ "naucs_at_10_diff1": -0.13830032198180964,
925
+ "naucs_at_20_max": 0.048789989832314584,
926
+ "naucs_at_20_std": 0.10940109965860004,
927
+ "naucs_at_20_diff1": -0.05835957874375413,
928
+ "naucs_at_50_max": -0.01769846400366676,
929
+ "naucs_at_50_std": 0.03309578073622774,
930
+ "naucs_at_50_diff1": -0.08590608452176329,
931
+ "naucs_at_100_max": -0.10067244123611839,
932
+ "naucs_at_100_std": -0.07858915244153995,
933
+ "naucs_at_100_diff1": -0.10852488217820669
934
+ },
935
+ "vidore/synthetic_mit_biomedical_tissue_interactions_unfiltered": {
936
+ "ndcg_at_1": 0.6625,
937
+ "ndcg_at_3": 0.64067,
938
+ "ndcg_at_5": 0.64653,
939
+ "ndcg_at_10": 0.68332,
940
+ "ndcg_at_20": 0.70843,
941
+ "ndcg_at_50": 0.72885,
942
+ "ndcg_at_100": 0.74088,
943
+ "map_at_1": 0.39814,
944
+ "map_at_3": 0.52723,
945
+ "map_at_5": 0.55866,
946
+ "map_at_10": 0.59555,
947
+ "map_at_20": 0.60975,
948
+ "map_at_50": 0.61739,
949
+ "map_at_100": 0.62024,
950
+ "recall_at_1": 0.39814,
951
+ "recall_at_3": 0.58754,
952
+ "recall_at_5": 0.65552,
953
+ "recall_at_10": 0.77079,
954
+ "recall_at_20": 0.83799,
955
+ "recall_at_50": 0.9003,
956
+ "recall_at_100": 0.94656,
957
+ "precision_at_1": 0.6625,
958
+ "precision_at_3": 0.39583,
959
+ "precision_at_5": 0.28375,
960
+ "precision_at_10": 0.18188,
961
+ "precision_at_20": 0.1075,
962
+ "precision_at_50": 0.05025,
963
+ "precision_at_100": 0.02756,
964
+ "mrr_at_1": 0.6625,
965
+ "mrr_at_3": 0.7291666666666665,
966
+ "mrr_at_5": 0.7382291666666665,
967
+ "mrr_at_10": 0.747906746031746,
968
+ "mrr_at_20": 0.7496362257024021,
969
+ "mrr_at_50": 0.7503963388309167,
970
+ "mrr_at_100": 0.7506175950965809,
971
+ "naucs_at_1_max": 0.41381934165815426,
972
+ "naucs_at_1_std": 0.14434322166399152,
973
+ "naucs_at_1_diff1": 0.47828806103942106,
974
+ "naucs_at_3_max": -0.02337866180796834,
975
+ "naucs_at_3_std": -0.092408589615909,
976
+ "naucs_at_3_diff1": -0.04358601319008129,
977
+ "naucs_at_5_max": -0.07407943615022797,
978
+ "naucs_at_5_std": -0.12635065467026393,
979
+ "naucs_at_5_diff1": -0.13429921326572394,
980
+ "naucs_at_10_max": -0.07844324279953604,
981
+ "naucs_at_10_std": -0.03584071250474527,
982
+ "naucs_at_10_diff1": -0.21034229778676125,
983
+ "naucs_at_20_max": -0.14896322731721384,
984
+ "naucs_at_20_std": -0.06855841147784922,
985
+ "naucs_at_20_diff1": -0.2730638665559868,
986
+ "naucs_at_50_max": -0.19266770277649403,
987
+ "naucs_at_50_std": -0.10702566472046023,
988
+ "naucs_at_50_diff1": -0.3440668710883354,
989
+ "naucs_at_100_max": -0.23435400494053177,
990
+ "naucs_at_100_std": -0.15235408708713993,
991
+ "naucs_at_100_diff1": -0.3875515616656985
992
+ }
993
+ }
994
+ }
vidore_eval.py CHANGED
@@ -36,6 +36,12 @@ def get_args():
  help='Path to model checkpoint if HF',
  default=''
  )
+ parser.add_argument(
+ '--model_revision',
+ type=str,
+ help='Commit Hash of the model as custom code is downloaded and executed',
+ default=None
+ )
  parser.add_argument(
  '--batch_size',
  type=int,
@@ -74,7 +80,8 @@ if __name__ == "__main__":
  device_map='cuda',
  trust_remote_code=True,
  torch_dtype=torch.bfloat16,
- attn_implementation="flash_attention_2"
+ attn_implementation="flash_attention_2",
+ revision=args.model_revision
  ).eval()
 
  vidore_evaluator_qa = ViDoReEvaluatorQA(vision_retriever) # ViDoRe-v1
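The change to vidore_eval.py adds an optional `--model_revision` flag and forwards it as `revision` when the model is loaded with `trust_remote_code=True`. The sketch below illustrates that flow in isolation, without importing `torch` or `transformers`. The helper `build_load_kwargs` is hypothetical and not part of the script, the parser is reduced to two arguments, and the commit hash used in the comment is a placeholder rather than a real revision.

```python
import argparse


def get_args(argv=None):
    """Reduced version of the script's parser: only the two relevant flags."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--model_name_or_path',
        type=str,
        help='Path to model checkpoint if HF',
        default=''
    )
    parser.add_argument(
        '--model_revision',
        type=str,
        help='Commit Hash of the model as custom code is downloaded and executed',
        default=None
    )
    return parser.parse_args(argv)


def build_load_kwargs(args):
    """Hypothetical helper: collect keyword arguments for the model loader.

    With revision=None the loader falls back to its default branch, so
    omitting the flag keeps the pre-change behavior.
    """
    return {
        'trust_remote_code': True,
        'attn_implementation': 'flash_attention_2',
        'revision': args.model_revision,
    }


# Example (placeholder hash, not a real commit):
# args = get_args(['--model_name_or_path', 'nvidia/llama-nemoretriever-colembed-1b-v1',
#                  '--model_revision', '0123abc'])
# kwargs = build_load_kwargs(args)  # pass as AutoModel.from_pretrained(path, **kwargs)
```

Pinning a commit hash matters here because `trust_remote_code=True` executes code shipped in the model repository; a fixed revision means the executed code cannot change between runs.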