hypothetical committed on
Commit 804e763 · verified
1 Parent(s): f835c5b

Update README.md

Files changed (1)
  1. README.md +20 -35
README.md CHANGED
@@ -47,7 +47,6 @@ from elastic_models.transformers import AutoModelForCausalLM
47
  # model configuration as well
48
  model_name = "mistralai/Mistral-7B-Instruct-v0.3"
49
  hf_token = ''
50
- hf_cache_dir = ''
51
  device = torch.device("cuda")
52
 
53
  # Create mode
@@ -57,7 +56,6 @@ tokenizer = AutoTokenizer.from_pretrained(
57
  model = AutoModelForCausalLM.from_pretrained(
58
  model_name,
59
  token=hf_token,
60
- cache_dir=hf_cache_dir,
61
  torch_dtype=torch.bfloat16,
62
  attn_implementation="sdpa",
63
  mode='s'
@@ -85,28 +83,29 @@ print(f"# Q:\n{prompt}\n")
85
  print(f"# A:\n{output}\n")
86
  ```
87
 
88
- ### Installation
89
-
90
-
91
- __System requirements__
92
-
93
  * GPUs: H100, L40s
94
-
95
  * CPU: AMD, Intel
96
-
97
- * OS: Linux #TODO
98
-
99
  * Python: 3.10-3.12
100
 
101
 
102
- To work with our models
103
 
104
  ```shell
105
  pip install thestage
106
  pip install elastic_models
107
  ```
108
 
109
- Then go to app.thestage.ai, log in, and generate an API token from your profile page. Set up the API token as follows:
110
 
111
  ```shell
112
  thestage config set --api-token <YOUR_API_TOKEN>
@@ -126,10 +125,10 @@ For quality evaluation we have used: #TODO link to github
126
 
127
  | Metric/Model | S | M | L | XL | Original | W8A8, int8 |
128
  |---------------|---|---|---|----|----------|------------|
129
- | MMLU | 0 | 0 | 0 | 0 | 0 | 0 |
130
- | PIQA | 0 | 0 | 0 | 0 | 0 | 0 |
131
- | Arc Challenge | 0 | 0 | 0 | 0 | 0 | 0 |
132
- | Winogrande | 0 | 0 | 0 | 0 | 0 | 0 |
133
 
134
 
135
  * **MMLU**: Evaluates general knowledge across 57 subjects including science, humanities, engineering, and more. Shows the model's ability to handle diverse academic topics.
@@ -139,32 +138,18 @@ For quality evaluation we have used: #TODO link to github
139
 
140
  ### Latency benchmarks
141
 
142
- We have profiled models in different scenarios:
143
-
144
- <table>
145
- <tr><th> 100 input/300 output; tok/s </th><th> 1000 input/1000 output; tok/s </th></tr>
146
- <tr><td>
147
-
148
- | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
149
- |-----------|-----|---|---|----|----------|------------|
150
- | H100 | 189 | 0 | 0 | 0 | 48 | 0 |
151
- | L40s | 79 | 0 | 0 | 0 | 42 | 0 |
152
-
153
-
154
-
155
- </td><td>
156
 
157
  | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
158
  |-----------|-----|---|---|----|----------|------------|
159
- | H100 | 189 | 0 | 0 | 0 | 48 | 0 |
160
- | L40s | 79 | 0 | 0 | 0 | 42 | 0 |
161
 
162
- </td></tr> </table>
163
 
164
 
165
  ## Links
166
 
167
  * __Platform__: [app.thestage.ai](https://app.thestage.ai)
168
- * __Elastic models Github__: [app.thestage.ai](app.thestage.ai)
169
  * __Subscribe for updates__: [TheStageAI X](https://x.com/TheStageAI)
170
  * __Contact email__: [email protected]
 
47
  # model configuration as well
48
  model_name = "mistralai/Mistral-7B-Instruct-v0.3"
49
  hf_token = ''
 
50
  device = torch.device("cuda")
51
 
52
  # Create mode
 
56
  model = AutoModelForCausalLM.from_pretrained(
57
  model_name,
58
  token=hf_token,
 
59
  torch_dtype=torch.bfloat16,
60
  attn_implementation="sdpa",
61
  mode='s'
 
83
  print(f"# A:\n{output}\n")
84
  ```
85
 
86
+ __System requirements:__
87
  * GPUs: H100, L40s
 
88
  * CPU: AMD, Intel
 
 
 
89
  * Python: 3.10-3.12
90
 
91
 
92
+ To work with our models, just run these lines in your terminal:
93
 
94
  ```shell
95
  pip install thestage
96
  pip install elastic_models
97
+ pip install flash_attn==2.7.3 --no-build-isolation
98
+ pip uninstall apex
99
+ echo '{
100
+ "meta-llama/Llama-3.2-1B-Instruct": 6,
101
+ "mistralai/Mistral-7B-Instruct-v0.3": 7,
102
+ "black-forest-labs/FLUX.1-schnell": 1,
103
+ "black-forest-labs/FLUX.1-dev": 5
104
+ }' > model_name_id.json
105
+ export ELASTIC_MODEL_ID_MAPPING=./model_name_id.json
106
  ```
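
The mapping file from the shell snippet above can also be written from Python, which avoids shell quoting altogether (inner double quotes placed inside a double-quoted `echo` string are consumed by the shell, so the resulting file may not be valid JSON). The following is a minimal sketch, not part of the README: the helper name `write_model_id_mapping` is hypothetical, and it assumes `elastic_models` reads `ELASTIC_MODEL_ID_MAPPING` from the environment of the current process.

```python
import json
import os
from pathlib import Path

# Model-name -> ID pairs copied from the shell snippet above.
MODEL_ID_MAPPING = {
    "meta-llama/Llama-3.2-1B-Instruct": 6,
    "mistralai/Mistral-7B-Instruct-v0.3": 7,
    "black-forest-labs/FLUX.1-schnell": 1,
    "black-forest-labs/FLUX.1-dev": 5,
}


def write_model_id_mapping(path="model_name_id.json"):
    """Write the mapping as valid JSON and return the absolute file path."""
    target = Path(path).resolve()
    target.write_text(json.dumps(MODEL_ID_MAPPING, indent=2), encoding="utf-8")
    return target


mapping_path = write_model_id_mapping()

# Assumption: the library looks this variable up at import or model-load time,
# so it is set here before elastic_models is imported in the same process.
os.environ["ELASTIC_MODEL_ID_MAPPING"] = str(mapping_path)
```

Setting the variable through `os.environ` only affects the current Python process and its children; a separate terminal session would still need the `export` line.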
107
 
108
+ Then go to [app.thestage.ai](https://app.thestage.ai), log in, and generate an API token from your profile page. Set up the API token as follows:
109
 
110
  ```shell
111
  thestage config set --api-token <YOUR_API_TOKEN>
 
125
 
126
  | Metric/Model | S | M | L | XL | Original | W8A8, int8 |
127
  |---------------|---|---|---|----|----------|------------|
128
+ | MMLU | 59.7 | 60.1 | 60.8 | 61.4 | 61.4 | 28 |
129
+ | PIQA | 80.8 | 82 | 81.7 | 81.5 | 81.5 | 65.3 |
130
+ | Arc Challenge | 56.6 | 55.1 | 56.8 | 57.4 | 57.4 | 33.2 |
131
+ | Winogrande | 73.2 | 72.3 | 73.2 | 74.1 | 74.1 | 57 |
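
To read the quality table at a glance, it can help to express each variant as a difference in points from the `Original` column. The sketch below only restates the numbers from the table; the function name `delta_vs_original` is hypothetical and not part of any library.

```python
# Scores copied from the quality table above (accuracy, %).
QUALITY = {
    "MMLU": {"S": 59.7, "M": 60.1, "L": 60.8, "XL": 61.4, "Original": 61.4, "W8A8, int8": 28.0},
    "PIQA": {"S": 80.8, "M": 82.0, "L": 81.7, "XL": 81.5, "Original": 81.5, "W8A8, int8": 65.3},
    "Arc Challenge": {"S": 56.6, "M": 55.1, "L": 56.8, "XL": 57.4, "Original": 57.4, "W8A8, int8": 33.2},
    "Winogrande": {"S": 73.2, "M": 72.3, "L": 73.2, "XL": 74.1, "Original": 74.1, "W8A8, int8": 57.0},
}


def delta_vs_original(scores, baseline="Original"):
    """Return each variant's score minus the baseline score, in points."""
    base = scores[baseline]
    return {
        name: round(value - base, 1)
        for name, value in scores.items()
        if name != baseline
    }


for metric, scores in QUALITY.items():
    print(metric, delta_vs_original(scores))
```

In these numbers the XL variant matches the original on every benchmark, while the S variant stays within about two points.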
132
 
133
 
134
  * **MMLU**: Evaluates general knowledge across 57 subjects including science, humanities, engineering, and more. Shows the model's ability to handle diverse academic topics.
 
138
 
139
  ### Latency benchmarks
140
 
141
+ __100 input/300 output; tok/s:__
142
 
143
  | GPU/Model | S | M | L | XL | Original | W8A8, int8 |
144
  |-----------|-----|---|---|----|----------|------------|
145
+ | H100 | 189 | 166 | 148 | 134 | 49 | 192 |
146
+ | L40s | 79 | 68 | 59 | 47 | 38 | 82 |
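
The throughput figures above are easier to compare as speedups over the `Original` column. This is a small sketch that only divides the numbers from the table; the helper `speedup_vs_original` is a hypothetical name used for illustration.

```python
# Throughput copied from the latency table above (tok/s, 100 input / 300 output).
THROUGHPUT = {
    "H100": {"S": 189, "M": 166, "L": 148, "XL": 134, "Original": 49, "W8A8, int8": 192},
    "L40s": {"S": 79, "M": 68, "L": 59, "XL": 47, "Original": 38, "W8A8, int8": 82},
}


def speedup_vs_original(row, baseline="Original"):
    """Return each variant's throughput divided by the baseline throughput."""
    base = row[baseline]
    return {
        name: round(value / base, 2)
        for name, value in row.items()
        if name != baseline
    }


for gpu, row in THROUGHPUT.items():
    print(gpu, speedup_vs_original(row))
```

A ratio above 1 means the variant generates more tokens per second than the original model on that GPU.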
147
 
 
148
 
149
 
150
  ## Links
151
 
152
  * __Platform__: [app.thestage.ai](https://app.thestage.ai)
153
+ <!-- * __Elastic models Github__: [app.thestage.ai](app.thestage.ai) -->
154
  * __Subscribe for updates__: [TheStageAI X](https://x.com/TheStageAI)
155
  * __Contact email__: [email protected]