jonny9f committed
Commit f2a6e92 · verified · 1 Parent(s): 9f63609

Upload fine-tuned food embeddings with improved score distribution

Files changed (2):
  1. README.md +49 -44
  2. model.safetensors +1 -1
README.md CHANGED
@@ -4,35 +4,35 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:143558
+ - dataset_size:210328
  - loss:ScaledCosineSimilarityLoss
  base_model: sentence-transformers/all-MiniLM-L6-v2
  widget:
- - source_sentence: Sapote, mamey, raw
+ - source_sentence: Whale, bowhead oil
  sentences:
- - Miracle Noodle Ready To Eat Spaghetti
- - Beef ribeye, broiled lean
- - Agave, raw
- - source_sentence: terebralia semistriata
+ - Cheese, American processed with vitamin D
+ - Cashews, dry roasted with salt
+ - Salmon, dried chum
+ - source_sentence: acipenser naccarii bonaparte 1836
  sentences:
- - Lamb, loin boneless lean cooked fast roasted
- - Beef, top round roast, boneless, lean, choice, cooked
- - mud whelk
- - source_sentence: cyttus novaezealandiae
+ - acipenser naccarii bonaparte, 1836
+ - butter clam
+ - Wild Rice, raw
+ - source_sentence: Granola Bar, Nature Valley Chewy Trail Mix
  sentences:
- - Yehuda Original Matzo-Style Squares
- - Beef, outside skirt steak, choice grilled
- - dory
- - source_sentence: Chocolate Cereal, prepared with water
+ - Sea lion meat, cooked (Alaska Native)
+ - Trail Mix, regular unsalted
+ - Soup, chunky vegetable, reduced sodium
+ - source_sentence: Bear Meat, polar raw
  sentences:
- - Soybeans, sprouted cooked steamed
- - Granola Cereal, homemade
- - Sausage, turkey pork and beef, low fat smoked
- - source_sentence: Lamb, Australian leg bottom raw
+ - Tea, tundra herb and Labrador blend
+ - Beef rib, small end, choice, cooked roasted
+ - sudan teak
+ - source_sentence: Beef, tenderloin, raw
  sentences:
- - Lamb, Australian leg whole lean and fat raw
- - ipil ipil
- - Whitefish, broad head eyes cheeks and soft bones
+ - Pork tenderloin, raw
+ - Fruit salad, tropical canned in heavy syrup
+ - Lamb leg, whole, raw
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
@@ -49,10 +49,10 @@ model-index:
  type: validation
  metrics:
  - type: pearson_cosine
- value: 0.9130932672076298
+ value: 0.8074689102281711
  name: Pearson Cosine
  - type: spearman_cosine
- value: 0.8292623565808726
+ value: 0.7665455013117164
  name: Spearman Cosine
  ---

@@ -106,9 +106,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("jonny9f/food_embeddings4")
  # Run inference
  sentences = [
- 'Lamb, Australian leg bottom raw',
- 'Lamb, Australian leg whole lean and fat raw',
- 'ipil ipil',
+ 'Beef, tenderloin, raw',
+ 'Pork tenderloin, raw',
+ 'Lamb leg, whole, raw',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -155,8 +155,8 @@ You can finetune this model on your own dataset.

  | Metric | Value |
  |:--------------------|:-----------|
- | pearson_cosine | 0.9131 |
- | **spearman_cosine** | **0.8293** |
+ | pearson_cosine | 0.8075 |
+ | **spearman_cosine** | **0.7665** |

  <!--
  ## Bias, Risks and Limitations
@@ -177,19 +177,19 @@ You can finetune this model on your own dataset.
  #### Unnamed Dataset


- * Size: 143,558 training samples
+ * Size: 210,328 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
  | | sentence_0 | sentence_1 | label |
  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------|
  | type | string | string | float |
- | details | <ul><li>min: 3 tokens</li><li>mean: 9.23 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.07 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.06</li><li>mean: 0.61</li><li>max: 1.0</li></ul> |
+ | details | <ul><li>min: 3 tokens</li><li>mean: 9.04 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.19 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.38</li><li>mean: 0.72</li><li>max: 1.0</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | label |
- |:-------------------------------|:------------------------------------------|:--------------------------------|
- | <code>yerba mate</code> | <code>ilex paraguariensis st.-hil.</code> | <code>0.6398421867965322</code> |
- | <code>huauzontle</code> | <code>lamb's quarters</code> | <code>0.6517711020762244</code> |
- | <code>Falafel, homemade</code> | <code>Sofrito sauce, homemade</code> | <code>0.1312318314535366</code> |
+ | sentence_0 | sentence_1 | label |
+ |:------------------------------------------------|:--------------------------------------------------|:---------------------------------|
+ | <code>Tortilla, plain or mutton sandwich</code> | <code>Roast beef sandwich, plain</code> | <code>0.42789756059646605</code> |
+ | <code>Lamb rib, cooked roasted</code> | <code>Lamb, leg shank half, cooked roasted</code> | <code>0.7156221181154251</code> |
+ | <code>red raspberry plant</code> | <code>rubus idaeus var. idaeus l.</code> | <code>0.8826086956521739</code> |
  * Loss: <code>__main__.ScaledCosineSimilarityLoss</code>

  ### Training Hyperparameters
@@ -324,15 +324,20 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | validation_spearman_cosine |
  |:------:|:----:|:-------------:|:--------------------------:|
- | 0.1114 | 500 | 0.0253 | - |
- | 0.2229 | 1000 | 0.017 | - |
- | 0.3343 | 1500 | 0.0159 | - |
- | 0.4457 | 2000 | 0.0152 | - |
- | 0.5572 | 2500 | 0.0139 | - |
- | 0.6686 | 3000 | 0.0135 | - |
- | 0.7800 | 3500 | 0.0134 | - |
- | 0.8915 | 4000 | 0.0126 | - |
- | 1.0 | 4487 | - | 0.8293 |
+ | 0.0761 | 500 | 0.0179 | - |
+ | 0.1521 | 1000 | 0.0067 | - |
+ | 0.2282 | 1500 | 0.0059 | - |
+ | 0.3043 | 2000 | 0.0051 | - |
+ | 0.3803 | 2500 | 0.0048 | - |
+ | 0.4564 | 3000 | 0.0046 | - |
+ | 0.5325 | 3500 | 0.0043 | - |
+ | 0.6086 | 4000 | 0.004 | - |
+ | 0.6846 | 4500 | 0.0038 | - |
+ | 0.7607 | 5000 | 0.0037 | - |
+ | 0.8368 | 5500 | 0.0037 | - |
+ | 0.9128 | 6000 | 0.0034 | - |
+ | 0.9889 | 6500 | 0.0033 | - |
+ | 1.0 | 6573 | - | 0.7665 |


  ### Framework Versions
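
Both the old and the new card reference a custom loss, `__main__.ScaledCosineSimilarityLoss`, whose definition is not included in this commit. For orientation only, the sketch below shows one way such a loss is commonly written for sentence-transformers: it mirrors the library's built-in CosineSimilarityLoss and adds a scaling step. The `scale` argument and the exact rescaling are assumptions for illustration, not the author's implementation.

```python
# Hypothetical sketch of a ScaledCosineSimilarityLoss for sentence-transformers.
# The actual __main__.ScaledCosineSimilarityLoss used for this model is not
# shown in the commit; the scaling below is an assumed variant of the
# built-in CosineSimilarityLoss.
import torch
from torch import nn
from sentence_transformers import SentenceTransformer


class ScaledCosineSimilarityLoss(nn.Module):
    def __init__(self, model: SentenceTransformer, scale: float = 1.0):
        super().__init__()
        self.model = model          # encoder being fine-tuned
        self.scale = scale          # assumed factor applied to the raw cosine
        self.loss_fct = nn.MSELoss()

    def forward(self, sentence_features, labels):
        # Encode both sides of each (sentence_0, sentence_1) training pair.
        embeddings = [
            self.model(features)["sentence_embedding"]
            for features in sentence_features
        ]
        # Cosine similarity per pair, rescaled before regressing onto the labels.
        scores = torch.cosine_similarity(embeddings[0], embeddings[1]) * self.scale
        return self.loss_fct(scores, labels.float().view(-1))
```
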
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1288de054125174b0b50ec2dfea722751b6e1a7be787c03e2771b69dc73c8e73
+ oid sha256:31e5917015fa832c3eb1819a92e73c59337a0d7a52b6c19765354e2fbbde837d
  size 90864192
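
The commit message mentions an improved score distribution. A quick way to eyeball pairwise scores from the uploaded checkpoint is sketched below; it extends the inference snippet from the card, the food names come from the widget examples above, and `util.cos_sim` is the library's cosine-similarity helper. The printed values are whatever this checkpoint produces, not numbers asserted here.

```python
# Inspect pairwise cosine scores produced by the uploaded checkpoint.
# Sentences are taken from the widget examples in the card above.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jonny9f/food_embeddings4")

foods = [
    "Beef, tenderloin, raw",
    "Pork tenderloin, raw",
    "Lamb leg, whole, raw",
    "Fruit salad, tropical canned in heavy syrup",
]

embeddings = model.encode(foods)               # shape (4, 384) for all-MiniLM-L6-v2
scores = util.cos_sim(embeddings, embeddings)  # 4 x 4 cosine-similarity matrix

for i in range(len(foods)):
    for j in range(i + 1, len(foods)):
        print(f"{foods[i]!r} vs {foods[j]!r}: {scores[i, j].item():.3f}")
```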