Arjun-G-Ravi committed
Commit d69dfcc · verified · 1 Parent(s): 02f9876

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +6 -6
  2. config.json +1 -1
  3. generation_config.json +4 -0
README.md CHANGED
@@ -1,21 +1,21 @@

  # Custom GPT Model
-
- This is a custom GPT model with the following modifications from standard GPT-2:
- - RMS normalization instead of LayerNorm
+
+ This is a custom GPT model with:
+ - RMS normalization
  - Rotary positional embeddings (RoPE)
  - Separate Q,K,V projections
  - Squared ReLU activation in MLP
  - QK normalization in attention
  - Zero initialization for projection layers
-
- ## Model Architecture
+
+ ## Architecture
  - Vocabulary Size: 50304
  - Context Length: 1024
  - Number of Layers: 12
  - Number of Heads: 6
  - Embedding Dimension: 768
-
+
  ## Usage
  ```python
  from transformers import AutoModel
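The README names RMS normalization and a squared ReLU activation without showing either. The following is a minimal sketch of the two operations as they are commonly defined, in plain Python. The function names, the `eps` value, and the omission of a learned gain are assumptions made for illustration and are not taken from this repository's code.

```python
import math


def rms_norm(x, eps=1e-6):
    """Scale a vector so that its root-mean-square is approximately 1.

    Assumption: no learned gain and eps=1e-6; the model in this repo may differ.
    """
    mean_square = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [v * scale for v in x]


def squared_relu(x):
    """Apply max(v, 0) ** 2 to every element."""
    return [max(v, 0.0) ** 2 for v in x]


normalized = rms_norm([3.0, -4.0])
activated = squared_relu([-2.0, 0.5, 3.0])
print(normalized)
print(activated)
```

Unlike LayerNorm, this form of normalization does not subtract the mean; it only rescales by the root-mean-square.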
config.json CHANGED
@@ -1,5 +1,4 @@
  {
-   "_attn_implementation_autoset": true,
    "architectures": [
      "CustomGPTPreTrainedModel"
    ],
@@ -9,6 +8,7 @@
    "n_head": 6,
    "n_layer": 12,
    "tokenizer_class": "GPT2Tokenizer",
+   "torch_dtype": "float32",
    "transformers_version": "4.48.1",
    "vocab_size": 50304
  }
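The values visible in this diff are enough to derive a couple of quantities, such as the per-head dimension and the size of the token embedding table at the newly recorded `float32` dtype. A rough sketch follows. The key name `n_embd` is an assumption: the README gives the embedding dimension as 768, but the corresponding config key lies outside the hunks shown. The helper `summarize` and the byte-size table are likewise hypothetical.

```python
import json

# Values copied from the diff above. "n_embd" is an ASSUMED key name; only the
# value 768 is confirmed (by the README), not the key itself.
CONFIG_TEXT = """
{
  "n_head": 6,
  "n_layer": 12,
  "n_embd": 768,
  "torch_dtype": "float32",
  "vocab_size": 50304
}
"""

# Storage size per element for a few common dtype names.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2}


def summarize(config):
    """Derive the head dimension and token-embedding size from config values."""
    if config["n_embd"] % config["n_head"] != 0:
        raise ValueError("embedding dimension must be divisible by n_head")
    head_dim = config["n_embd"] // config["n_head"]
    embedding_params = config["vocab_size"] * config["n_embd"]
    embedding_bytes = embedding_params * BYTES_PER_ELEMENT[config["torch_dtype"]]
    return {
        "head_dim": head_dim,
        "embedding_params": embedding_params,
        "embedding_bytes": embedding_bytes,
    }


summary = summarize(json.loads(CONFIG_TEXT))
print(summary)
```

With 6 heads and an embedding dimension of 768, each head works on 128 dimensions. The embedding figure covers the token embedding table only and says nothing about whether it is shared with the output layer.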
generation_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "_from_model_config": true,
+   "transformers_version": "4.48.1"
+ }
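Both `config.json` and the new `generation_config.json` record a `transformers_version`, so one quick sanity check after an upload like this is to confirm the two agree. The sketch below uses only the fields shown in this commit. The helper name `versions_match` is hypothetical, and the model config string is an excerpt, not the full file.

```python
import json

# Excerpt of config.json (only fields visible in the diff) and the full
# generation_config.json as added in this commit.
MODEL_CONFIG_EXCERPT = '{"torch_dtype": "float32", "transformers_version": "4.48.1"}'
GENERATION_CONFIG = '{"_from_model_config": true, "transformers_version": "4.48.1"}'


def versions_match(model_cfg_text, generation_cfg_text):
    """Return True when both JSON documents report the same transformers_version."""
    model_cfg = json.loads(model_cfg_text)
    generation_cfg = json.loads(generation_cfg_text)
    return model_cfg.get("transformers_version") == generation_cfg.get(
        "transformers_version"
    )


match = versions_match(MODEL_CONFIG_EXCERPT, GENERATION_CONFIG)
print(match)
```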