Upload fine-tuned food embeddings with improved score distribution
- README.md +49 -44
- model.safetensors +1 -1
README.md
CHANGED
@@ -4,35 +4,35 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:
+- dataset_size:210328
 - loss:ScaledCosineSimilarityLoss
 base_model: sentence-transformers/all-MiniLM-L6-v2
 widget:
-- source_sentence:
+- source_sentence: Whale, bowhead oil
   sentences:
-  -
-  -
-  -
+  - Cheese, American processed with vitamin D
+  - Cashews, dry roasted with salt
+  - Salmon, dried chum
-- source_sentence:
+- source_sentence: acipenser naccarii bonaparte 1836
   sentences:
-  -
-  -
-  -
+  - acipenser naccarii bonaparte, 1836
+  - butter clam
+  - Wild Rice, raw
-- source_sentence:
+- source_sentence: Granola Bar, Nature Valley Chewy Trail Mix
   sentences:
-  -
-  -
-  -
+  - Sea lion meat, cooked (Alaska Native)
+  - Trail Mix, regular unsalted
+  - Soup, chunky vegetable, reduced sodium
-- source_sentence:
+- source_sentence: Bear Meat, polar raw
   sentences:
-  -
-  -
-  -
+  - Tea, tundra herb and Labrador blend
+  - Beef rib, small end, choice, cooked roasted
+  - sudan teak
-- source_sentence:
+- source_sentence: Beef, tenderloin, raw
   sentences:
-  -
-  -
-  -
+  - Pork tenderloin, raw
+  - Fruit salad, tropical canned in heavy syrup
+  - Lamb leg, whole, raw
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -49,10 +49,10 @@ model-index:
       type: validation
     metrics:
     - type: pearson_cosine
-      value: 0.
+      value: 0.8074689102281711
       name: Pearson Cosine
     - type: spearman_cosine
-      value: 0.
+      value: 0.7665455013117164
       name: Spearman Cosine
 ---
 
@@ -106,9 +106,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("jonny9f/food_embeddings4")
 # Run inference
 sentences = [
-    '
-    '
-    '
+    'Beef, tenderloin, raw',
+    'Pork tenderloin, raw',
+    'Lamb leg, whole, raw',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -155,8 +155,8 @@ You can finetune this model on your own dataset.
 
 | Metric              | Value      |
 |:--------------------|:-----------|
-| pearson_cosine      | 0.
+| pearson_cosine      | 0.8075     |
-| **spearman_cosine** | **0.
+| **spearman_cosine** | **0.7665** |
 
 <!--
 ## Bias, Risks and Limitations
@@ -177,19 +177,19 @@ You can finetune this model on your own dataset.
 #### Unnamed Dataset
 
 
-* Size:
+* Size: 210,328 training samples
 * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
   |         | sentence_0 | sentence_1 | label |
   |:--------|:-----------|:-----------|:------|
   | type    | string     | string     | float |
-  | details | <ul><li>min: 3 tokens</li><li>mean: 9.
+  | details | <ul><li>min: 3 tokens</li><li>mean: 9.04 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.19 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.38</li><li>mean: 0.72</li><li>max: 1.0</li></ul> |
 * Samples:
-  | sentence_0
-  |
-  | <code>
-  | <code>
-  | <code>
+  | sentence_0                                      | sentence_1                                        | label                            |
+  |:------------------------------------------------|:--------------------------------------------------|:---------------------------------|
+  | <code>Tortilla, plain or mutton sandwich</code> | <code>Roast beef sandwich, plain</code>           | <code>0.42789756059646605</code> |
+  | <code>Lamb rib, cooked roasted</code>           | <code>Lamb, leg shank half, cooked roasted</code> | <code>0.7156221181154251</code>  |
+  | <code>red raspberry plant</code>                | <code>rubus idaeus var. idaeus l.</code>          | <code>0.8826086956521739</code>  |
 * Loss: <code>__main__.ScaledCosineSimilarityLoss</code>
 
 ### Training Hyperparameters
@@ -324,15 +324,20 @@ You can finetune this model on your own dataset.
 ### Training Logs
 | Epoch  | Step | Training Loss | validation_spearman_cosine |
 |:------:|:----:|:-------------:|:--------------------------:|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
+| 0.0761 | 500  | 0.0179        | -                          |
+| 0.1521 | 1000 | 0.0067        | -                          |
+| 0.2282 | 1500 | 0.0059        | -                          |
+| 0.3043 | 2000 | 0.0051        | -                          |
+| 0.3803 | 2500 | 0.0048        | -                          |
+| 0.4564 | 3000 | 0.0046        | -                          |
+| 0.5325 | 3500 | 0.0043        | -                          |
+| 0.6086 | 4000 | 0.004         | -                          |
+| 0.6846 | 4500 | 0.0038        | -                          |
+| 0.7607 | 5000 | 0.0037        | -                          |
+| 0.8368 | 5500 | 0.0037        | -                          |
+| 0.9128 | 6000 | 0.0034        | -                          |
+| 0.9889 | 6500 | 0.0033        | -                          |
+| 1.0    | 6573 | -             | 0.7665                     |
 
 
 ### Framework Versions
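The card's usage snippet stops at `embeddings = model.encode(sentences)`, which for an `all-MiniLM-L6-v2`-based model yields one 384-dimensional vector per sentence; scoring a pair is then a cosine-similarity computation over those vectors. A minimal sketch with plain NumPy and toy 4-dimensional stand-in vectors (the helper name and vectors are illustrative, not part of the card):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the two vectors over the
    product of their Euclidean norms, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for the 384-dimensional MiniLM embeddings.
beef = np.array([0.9, 0.1, 0.0, 0.1])   # "Beef, tenderloin, raw"
pork = np.array([0.8, 0.2, 0.1, 0.1])   # "Pork tenderloin, raw"
fruit = np.array([0.0, 0.1, 0.9, 0.2])  # "Fruit salad, tropical canned in heavy syrup"

# The two meat cuts score closer together than meat vs. fruit salad.
print(cosine_similarity(beef, pork) > cosine_similarity(beef, fruit))  # True
```

With the real model, recent sentence-transformers releases expose the same computation as `model.similarity(embeddings, embeddings)`.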
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:31e5917015fa832c3eb1819a92e73c59337a0d7a52b6c19765354e2fbbde837d
 size 90864192
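The README lists the loss as <code>__main__.ScaledCosineSimilarityLoss</code>, a custom class from the author's training script whose definition is not part of this commit. Given that the card's labels lie in roughly [0.38, 1.0], one plausible reading is a cosine similarity rescaled from [-1, 1] into [0, 1] and regressed against the label with mean squared error. A NumPy sketch under that assumption (the function name and the scaling are guesses, not the author's code):

```python
import numpy as np

def scaled_cosine_similarity_loss(emb1: np.ndarray, emb2: np.ndarray,
                                  labels) -> float:
    """Hypothetical reconstruction: MSE between a [0, 1]-rescaled
    cosine similarity and the gold similarity label."""
    # L2-normalize each row so the dot product is the cosine similarity.
    a = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    b = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    cos = np.sum(a * b, axis=1)        # in [-1, 1]
    scaled = (cos + 1.0) / 2.0         # rescaled to [0, 1] to match labels
    return float(np.mean((scaled - np.asarray(labels)) ** 2))
```

Identical embeddings with label 1.0 give zero loss under this sketch; the actual class may differ (for example, a learned scale or a different label mapping).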