This quant was made using exllamav2-0.0.21 with [pippa dataset](https://huggingf

This quant fits over 20k context on 24GB VRAM on Windows in my local testing (with exl2 Q4 cache); you might be able to fit more depending on what else is using VRAM.
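
For illustration only (not from the original card), here is roughly what that setup looks like with exllamav2's Python API; the local model path and the short generation at the end are placeholder assumptions:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_Q4, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "models/Midnight-Miqu-70B-v1.5_exl2_2.4bpw_rpcal"  # placeholder local path
config.prepare()
config.max_seq_len = 20480  # ~20k context, per the note above

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_Q4(model, lazy=True)  # 4-bit KV cache, roughly 1/4 the VRAM of FP16 cache
model.load_autosplit(cache)                  # load weights, splitting across available VRAM
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()       # default sampling settings
print(generator.generate_simple("Hello,", settings, 32))
```

The Q4 cache is what makes the difference here: at FP16, the KV cache for ~20k tokens on a 70B model adds several GB on top of the roughly 21GB of 2.4bpw weights, which would not fit in 24GB.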

I briefly tested this quant in some random RPs (including ones at over 8k and 20k context), and it seems to work fine.

## Prompt Templates

Further details on prompting this model will also pop up under the [model discus

2.4bpw exl2 quant on default dataset: [Midnight-Miqu-70B-v1.5_exl2_2.4bpw](https://huggingface.co/DeusImperator/Midnight-Miqu-70B-v1.5_exl2_2.4bpw)

The above quant might be a little smarter based on limited testing, but this rpcal one might be a bit better for RP.

### Original readme below