Vezora committed
Commit 229f2a5 · 1 Parent(s): d88a42d

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -4,7 +4,7 @@ license: apache-2.0
 <!DOCTYPE html>
 <img src="https://imgur.com/a/3HUIVxJ" width="300">
 
-# Mistral 15b: A New Base Model
+# Mistral 14b: A New Base Model
 
 This model is intended to serve as a new Mistral 14b base model. It has been enhanced with a LoRA adapter attached to all 62 layers of the merged model. The model generates coherent output and responds accurately to inputs; however, it tends to over-respond with unprompted questions when given more than 512 tokens, which is its QLoRA training limit.
 
@@ -29,4 +29,4 @@ Initially, the output from the model was pure jargon. To rectify this, a LoRa ad
 - speechless-code-mistral-7b-v1.0 (https://huggingface.co/uukuguy/speechless-code-mistral-7b-v1.0)
 
 ## Upcoming Mistral 30b
-- We currently have a Mistral model with 30 billion parameters (29.6B params) in development. At present, the model's output is not yet refined and may read as jargon. If there is community interest in fine-tuning this model, we are open to uploading it in its current state; otherwise, we plan to complete our training process before making it available. Let us know with a post in this repo's discussions!
+- We currently have a Mistral model with 29 billion parameters (29.3B params) in development. At present, the model's output is not yet refined and may read as jargon. If there is community interest in fine-tuning this model, we are open to uploading it in its current state; otherwise, we plan to complete our training process before making it available. Let us know with a post in this repo's discussions!
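The setup the README describes — a QLoRA fine-tune that attaches a LoRA adapter across all 62 layers of the merged model, trained at a 512-token context — maps onto a fairly standard Hugging Face workflow. Below is a minimal sketch of that setup using `transformers` and `peft`; the model id, LoRA rank, and target module names are illustrative assumptions, not values taken from this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

BASE_ID = "Vezora/Mistral-14b"  # hypothetical id for the merged base model

# 4-bit NF4 quantization of the frozen base weights is what makes the
# fine-tune "QLoRA" rather than plain LoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, quantization_config=bnb_config)

# peft inserts an adapter wherever the named modules occur, so targeting the
# attention projections by name places a LoRA adapter in every layer of the
# merged stack (62 layers, per the README).
lora_config = LoraConfig(
    r=16,  # illustrative rank, not the repo's actual value
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# The README cites a 512-token training limit, so training inputs would be
# truncated to that length at tokenization time:
batch = tokenizer("example input", truncation=True, max_length=512, return_tensors="pt")
```

Note that the 512-token cap is enforced when tokenizing the training data rather than in the model config, which would explain the degraded behavior the README reports on longer inputs.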