---
license: llama3.3
inference: false
base_model: sophosympatheia/Strawberrylemonade-L3-70B-v1.1
base_model_relation: quantized
tags:
- exl2
- not-for-all-audiences
library_name: exllamav2
pipeline_tag: text-generation
---
# exllamav2 quantizations of sophosympatheia's Strawberrylemonade-L3-70B-v1.1
I'm planning to start phasing out the smaller exl2 sizes in favor of exl3, which performs well even without tensor parallelism. I'm keeping the larger quants here for now because they still benefit significantly from exl2's tensor parallelism. Let me know if you'd like to see a smaller size in exl2.
- 4.25bpw h6 (36.623 GiB)
- 6.00bpw h6 (50.568 GiB)
- 8.00bpw h8 (66.710 GiB)
- measurement.json