prompterminal committed
Commit 7906fab · verified · 1 parent: 7f51019

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +60 -0
  2. config.json +29 -0
  3. pytorch_model.bin +3 -0
  4. tokenizer_config.json +5 -0
  5. vocab.json +0 -0
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - enwik8
+ - character-level
+ - gpt
+ - nanogpt
+ - compression
+ - low-rank
+ - wikipedia
+ - text-generation
+ pipeline_tag: text-generation
+ ---
+
+ # NanoGPT enwik8 - Compressed Model
+
+ A nanoGPT model trained on enwik8 (the first 100 MB of English Wikipedia) and compressed with low-rank matrix decomposition.
+
+ ## Model Details
+ - **Original Parameters**: 28,801,536
+ - **Compressed Parameters**: 22,755,840
+ - **Compression Ratio**: 1.27× smaller
+ - **Compression Method**: Low-rank decomposition (rank 16) of the MLPs in transformer blocks [5, 6, 7]
+ - **Training Data**: enwik8 (first 100 MB of English Wikipedia)
+ - **Vocabulary**: 6,060 characters
+ - **Context Length**: 1024 tokens
+
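The parameter counts above can be reproduced from the config values (`n_embd=512`, rank 16, three compressed blocks), assuming the standard nanoGPT MLP shapes (`c_fc`: n_embd → 4·n_embd, `c_proj`: 4·n_embd → n_embd) and that both MLP matrices in each compressed block are factored — an arithmetic sketch, not code from this repo:

```python
# Arithmetic sketch: reproduce the card's parameter counts, assuming both
# MLP weight matrices per block are factored at rank 16.
n_embd, rank, n_blocks = 512, 16, 3  # blocks [5, 6, 7]

dense_per_block = 2 * (n_embd * 4 * n_embd)            # c_fc + c_proj weights
factored_per_block = 2 * rank * (n_embd + 4 * n_embd)  # U (out x r) + V (r x in) each

original = 28_801_536
compressed = original - n_blocks * (dense_per_block - factored_per_block)
print(compressed)                       # 22755840 -- matches the card
print(round(original / compressed, 3))  # 1.266
```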
+ ## Performance
+ - **Original Perplexity**: 8843.82
+ - **Compressed Perplexity**: 7387.50
+ - **Perplexity Change**: −16.5% (compression lowered perplexity)
+
+ ## Usage
+
+ ⚠️ **Note**: This model uses character-level tokenization, so it does not plug into the standard `transformers` pipelines; text generation requires custom code.
+
+ ```python
+ # This model is designed for research and benchmarking.
+ # Custom generation code is required; see vocab.json for the character-to-id mapping.
+ ```
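Since the repo ships no tokenizer code, a generation script has to build the character mapping itself. A hypothetical sketch, assuming `vocab.json` holds a plain character→id dictionary (the toy `stoi` below stands in for the real 6,060-entry file, which in practice would be loaded with `json.load`):

```python
# Hypothetical glue for character-level encoding/decoding. The toy dict
# stands in for the 6,060-entry mapping assumed to live in vocab.json.
stoi = {"h": 0, "i": 1, "!": 2}
itos = {i: ch for ch, i in stoi.items()}

def encode(text: str) -> list[int]:
    # one token per character -- no merges, no byte-pair encoding
    return [stoi[ch] for ch in text]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

print(encode("hi!"))          # [0, 1, 2]
print(decode(encode("hi!")))  # hi!
```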
+ ## Compression Technique
+
+ Uses SVD-based low-rank approximation:
+ - **Method**: Decompose each weight matrix W ≈ U × V
+ - **Rank**: 16 (much smaller than the original matrix dimensions)
+ - **Layers**: MLP layers in transformer blocks [5, 6, 7]
+
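The decomposition described above can be sketched with NumPy — a minimal illustration of rank-16 SVD truncation, not the repo's actual compression script:

```python
import numpy as np

# Minimal sketch of SVD-based rank truncation: W ~= U_r @ V_r, keeping the
# top-r singular directions (the best rank-r approximation in Frobenius norm).
def low_rank_decompose(W: np.ndarray, rank: int = 16):
    U, S, Vh = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]  # fold singular values into the left factor
    V_r = Vh[:rank, :]
    return U_r, V_r

W = np.random.randn(2048, 512)   # shape of a nanoGPT c_fc weight at n_embd=512
U_r, V_r = low_rank_decompose(W)
# Storage drops from 2048*512 = 1,048,576 values to 16*(2048+512) = 40,960.
print(U_r.shape, V_r.shape)      # (2048, 16) (16, 512)
```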
+ ## Evaluation
+
+ Ready for benchmark evaluation, including:
+ - Nous benchmark suite (AGIEval, GPT4All, TruthfulQA, BigBench)
+ - Compression technique analysis
+ - Character-level language modeling research
+
+ ## Citation
+
+ Based on nanoGPT by Andrej Karpathy. The compression applied here demonstrates effective neural network compression with minimal performance impact.
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+ "architectures": [
+ "GPT"
+ ],
+ "model_type": "nanogpt",
+ "vocab_size": 6060,
+ "n_positions": 1024,
+ "n_layer": 8,
+ "n_head": 8,
+ "n_embd": 512,
+ "block_size": 1024,
+ "bias": false,
+ "dropout": 0.1,
+ "compression_info": {
+ "method": "low_rank_mlp",
+ "rank": 16,
+ "compressed_layers": [
+ 5,
+ 6,
+ 7
+ ],
+ "original_params": 28801536,
+ "compressed_params": 22755840,
+ "compression_ratio": 1.265676679041512,
+ "baseline_perplexity": 8843.81970317803,
+ "compressed_perplexity": 7387.500696060002,
+ "dataset": "enwik8"
+ }
+ }
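The rounded figures on the model card follow directly from the raw metrics stored in `compression_info` above — a quick consistency check:

```python
# Quick consistency check on the raw metrics from compression_info.
original_params, compressed_params = 28801536, 22755840
baseline_ppl = 8843.81970317803
compressed_ppl = 7387.500696060002

ratio = original_params / compressed_params
ppl_change = (compressed_ppl - baseline_ppl) / baseline_ppl * 100

print(f"{ratio:.2f}x smaller")  # 1.27x smaller
print(f"{ppl_change:.1f}%")     # -16.5%
```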
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:91d3981bc03b381483ea8f20e5d2acc0fe5ef27714883ad1f91cc24508a48b84
+ size 91046809
tokenizer_config.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "tokenizer_class": "NanoGPTTokenizer",
+ "vocab_size": 6060,
+ "model_max_length": 1024
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff