jonny9f committed on
Commit d8d10f5 · verified · 1 Parent(s): 09df92d

Upload food embeddings model

Files changed (2)
  1. README.md +45 -45
  2. model.safetensors +1 -1
README.md CHANGED
@@ -4,35 +4,35 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:4256
  - loss:ContrastiveLoss
  base_model: sentence-transformers/all-mpnet-base-v2
  widget:
- - source_sentence: So Delicious Key Lime Yogurt
  sentences:
- - Squash, yellow raw
- - Babyfood, mixed fruit yogurt
- - Beef, rib eye steak/roast bone-in lip-on raw
- - source_sentence: Cocoa Bumpers Cereal, Quaker Mother's
  sentences:
- - Lovebird Cereal Honey Box
- - Ham, canned roasted
- - Chicken, light meat with skin, cooked stewed
- - source_sentence: Broadbeans, raw immature seeds
  sentences:
- - Peas, canned rinsed
- - Promin Minestrone Soup
- - Rice, brown long-grain cooked
- - source_sentence: Chicken, dark meat thigh meat and skin, added solution cooked braised
  sentences:
- - Moose, raw
- - Chickpeas, cooked with salt
- - Sausage, pork turkey and beef reduced sodium
- - source_sentence: Shortening, soy and cottonseed for pastries
  sentences:
- - Soup, chicken noodle reduced sodium
- - Sea lion kidney, Steller (Alaska Native)
- - Salad, McDonald's side
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
@@ -49,10 +49,10 @@ model-index:
  type: validation
  metrics:
  - type: pearson_cosine
- value: 0.8269809784218102
  name: Pearson Cosine
  - type: spearman_cosine
- value: 0.845955787172452
  name: Spearman Cosine
  ---

@@ -106,9 +106,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("jonny9f/food_embeddings4")
  # Run inference
  sentences = [
- 'Shortening, soy and cottonseed for pastries',
- 'Sea lion kidney, Steller (Alaska Native)',
- 'Soup, chicken noodle reduced sodium',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -153,10 +153,10 @@ You can finetune this model on your own dataset.
  * Dataset: `validation`
  * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

- | Metric | Value |
- |:--------------------|:----------|
- | pearson_cosine | 0.827 |
- | **spearman_cosine** | **0.846** |

  <!--
  ## Bias, Risks and Limitations
@@ -177,19 +177,19 @@ You can finetune this model on your own dataset.
  #### Unnamed Dataset


- * Size: 4,256 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence_0 | sentence_1 | label |
- |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------|
- | type | string | string | float |
- | details | <ul><li>min: 3 tokens</li><li>mean: 9.91 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.96 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.39</li><li>max: 0.85</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | label |
- |:---------------------------------------------|:------------------------------------------------|:--------------------------------|
- | <code>Fava Beans, cooked without salt</code> | <code>Red Kidney Beans, cooked with salt</code> | <code>0.85</code> |
- | <code>Spaghetti squash, raw</code> | <code>Mushrooms, white cooked</code> | <code>0.5719985961914062</code> |
- | <code>Chicken, back with skin roasted</code> | <code>Beef rib, roasted</code> | <code>0.0</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
@@ -202,8 +202,8 @@ You can finetune this model on your own dataset.
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
  - `num_train_epochs`: 1
  - `multi_dataset_batch_sampler`: round_robin

@@ -214,8 +214,8 @@ You can finetune this model on your own dataset.
  - `do_predict`: False
  - `eval_strategy`: no
  - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
@@ -331,7 +331,7 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | validation_spearman_cosine |
  |:-----:|:----:|:--------------------------:|
- | 1.0 | 133 | 0.8460 |


  ### Framework Versions
 
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:35819
  - loss:ContrastiveLoss
  base_model: sentence-transformers/all-mpnet-base-v2
  widget:
+ - source_sentence: Broccoli, stalks raw
  sentences:
+ - Carrots, canned no salt
+ - Squash, Indian raw
+ - Biscuit, Popeyes
+ - source_sentence: Cereal, General Mills Cheerios
  sentences:
+ - Chocolate pudding, ready-to-eat
+ - Mackerel, Atlantic cooked
+ - Cereal, Malt-O-Meal Berry Colossal Crunch
+ - source_sentence: Beef Tenderloin, lean cooked broiled
  sentences:
+ - Elk, tenderloin lean cooked broiled
+ - Chicken, capons giblets cooked simmered
+ - Barley, pearled cooked
+ - source_sentence: Beef, New Zealand eye round slow roasted
  sentences:
+ - Sorghum flour, white pearled raw
+ - Beef, Denver cut steak, grilled
+ - Pudding, chocolate instant with 2% milk
+ - source_sentence: Beef, shoulder steak boneless grilled
  sentences:
+ - Pork, bacon, cooked pan-fried
+ - Oyster, eastern breaded fried
+ - Beef, top blade steak, grilled select
  pipeline_tag: sentence-similarity
  library_name: sentence-transformers
  metrics:
 
  type: validation
  metrics:
  - type: pearson_cosine
+ value: 0.8767870213264454
  name: Pearson Cosine
  - type: spearman_cosine
+ value: 0.8665397416848721
  name: Spearman Cosine
  ---

 
  model = SentenceTransformer("jonny9f/food_embeddings4")
  # Run inference
  sentences = [
+ 'Beef, shoulder steak boneless grilled',
+ 'Beef, top blade steak, grilled select',
+ 'Pork, bacon, cooked pan-fried',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
 
  * Dataset: `validation`
  * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

+ | Metric | Value |
+ |:--------------------|:-----------|
+ | pearson_cosine | 0.8768 |
+ | **spearman_cosine** | **0.8665** |

  <!--
  ## Bias, Risks and Limitations
 
  #### Unnamed Dataset


+ * Size: 35,819 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 3 tokens</li><li>mean: 10.09 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 9.88 tokens</li><li>max: 24 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.33</li><li>max: 0.85</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:---------------------------------------------------------------|:-------------------------------------------------------|:--------------------------------|
+ | <code>Instant Oats, maple and brown sugar fortified dry</code> | <code>Chocolate frosting, creamy dry mix</code> | <code>0.0</code> |
+ | <code>Fried Chicken Breast, meat only extra crispy KFC</code> | <code>Brothers Natural Fruit Crisps Strawberry</code> | <code>0.0</code> |
+ | <code>Sesame seed dressing, regular</code> | <code>Italian dressing, fat-free salad dressing</code> | <code>0.7745922088623046</code> |
  * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
  ```json
  {
 
  ### Training Hyperparameters
  #### Non-Default Hyperparameters

+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
  - `num_train_epochs`: 1
  - `multi_dataset_batch_sampler`: round_robin

 
  - `do_predict`: False
  - `eval_strategy`: no
  - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 128
+ - `per_device_eval_batch_size`: 128
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
 
  ### Training Logs
  | Epoch | Step | validation_spearman_cosine |
  |:-----:|:----:|:--------------------------:|
+ | 1.0 | 280 | 0.8665 |


  ### Framework Versions
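
The step counts in the Training Logs rows (133 before, 280 after) are consistent with one epoch over the stated dataset sizes at the stated batch sizes. The sketch below checks that arithmetic. It assumes a single training device and that the final partial batch is kept, neither of which the diff states; `gradient_accumulation_steps` is 1 as listed. The helper name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(dataset_size: int, batch_size: int, grad_accum: int = 1) -> int:
    """Optimizer steps in one epoch.

    Assumes one device and that the last partial batch is kept
    (both are assumptions, not stated in the diff).
    """
    batches = math.ceil(dataset_size / batch_size)
    return math.ceil(batches / grad_accum)


# Values taken from the removed (-) and added (+) lines of the README diff.
old_steps = steps_per_epoch(dataset_size=4256, batch_size=32)
new_steps = steps_per_epoch(dataset_size=35819, batch_size=128)

print(old_steps, new_steps)
```

Under those assumptions both results match the logged `Step` values, which suggests the new run saw roughly 8.4 times as many pairs in about twice as many optimizer steps.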
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:44512fb0fe9566fddc194c4ecd20617070852653233eccc58ae88f6fc48e2c73
  size 437967672
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:1d1cb1586ee53cc57a2af6f1c1b763aa44287d52636dcd9d832e724a9f9fec6f
  size 437967672
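
For readers comparing the `pearson_cosine` and `spearman_cosine` values in the README diff: these metrics score each sentence pair by the cosine similarity of its two embeddings, then correlate those scores with the gold labels. The sketch below reproduces that calculation in plain Python on toy 2-D vectors. The vectors and labels are made up for illustration and are not drawn from the model or its validation set, and the helper names are hypothetical; the actual evaluator relies on library routines rather than this code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy embedding pairs and gold similarity labels (illustrative only).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
]
gold = [0.85, 0.5, 0.0]

scores = [cosine(u, v) for u, v in pairs]
pearson_cosine = pearson(scores, gold)
spearman_cosine = spearman(scores, gold)
print(pearson_cosine, spearman_cosine)
```

Spearman depends only on the ordering of the scores, so it rewards a model that ranks pairs correctly even when the raw cosine values are not linearly related to the labels; Pearson is sensitive to that linear relationship.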