Vezora committed
Commit 229f2a5 · 1 Parent(s): d88a42d

Update README.md

Files changed (1): README.md (+2 -2)
README.md CHANGED
@@ -4,7 +4,7 @@ license: apache-2.0
 <!DOCTYPE html>
 <img src="https://imgur.com/a/3HUIVxJ" width="300">
 
-# Mistral 15b: A New Base Model
+# Mistral 14b: A New Base Model
 
 This model is intended to serve as a new Mistral 14b base model. It has been enhanced with a LoRA adapter attached to all 62 layers of the merged model. The model generates coherent output and responds accurately to inputs; however, it tends to over-respond with unprompted questions when given more than 512 tokens, which is its QLoRA training limit.
 
@@ -29,4 +29,4 @@ Initially, the output from the model was pure jargon. To rectify this, a LoRa ad
 - speechless-code-mistral-7b-v1.0 (https://huggingface.co/uukuguy/speechless-code-mistral-7b-v1.0)
 
 ## Upcoming Mistral 30b
-- We currently have a Mistral model with 30 billion parameters (29.6B params) in development. At present, the model's output is not yet refined and may read as jargon. If there is community interest in fine-tuning this model, we are open to uploading it in its current state; otherwise, we plan to complete our training process before making it available. Let us know with a post in this repo's discussions!
+- We currently have a Mistral model with 29 billion parameters (29.3B params) in development. At present, the model's output is not yet refined and may read as jargon. If there is community interest in fine-tuning this model, we are open to uploading it in its current state; otherwise, we plan to complete our training process before making it available. Let us know with a post in this repo's discussions!
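The setup the README describes — a QLoRA fine-tune that attaches a LoRA adapter across all 62 layers of the merged model, trained at a 512-token context — maps onto a fairly standard Hugging Face workflow. Below is a minimal sketch of that setup using `transformers` and `peft`; the model id, LoRA rank, and target module names are illustrative assumptions, not values taken from this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

BASE_ID = "Vezora/Mistral-14b"  # hypothetical id for the merged base model

# 4-bit NF4 quantization of the frozen base weights is what makes the
# fine-tune "QLoRA" rather than plain LoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
model = AutoModelForCausalLM.from_pretrained(BASE_ID, quantization_config=bnb_config)

# peft inserts an adapter wherever the named modules occur, so targeting the
# attention projections by name places a LoRA adapter in every layer of the
# merged stack (62 layers, per the README).
lora_config = LoraConfig(
    r=16,  # illustrative rank, not the repo's actual value
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# The README cites a 512-token training limit, so training inputs would be
# truncated to that length at tokenization time:
batch = tokenizer("example input", truncation=True, max_length=512, return_tensors="pt")
```

Note that the 512-token cap is enforced when tokenizing the training data rather than in the model config, which would explain the degraded behavior the README reports on longer inputs.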