prithivMLmods committed · verified
Commit 03480ea · 1 Parent(s): c49e02f

Update README.md

Files changed (1): README.md +33 -1
README.md CHANGED
@@ -8,4 +8,36 @@ pipeline_tag: text-generation
  library_name: transformers
  tags:
  - text-generation-inference
- ---
+ ---
+
+ # **Jan-nano-GGUF**
+
+ > Jan-nano is a compact 4-billion-parameter language model designed and trained for deep research tasks. It is optimized to work seamlessly with Model Context Protocol (MCP) servers, enabling efficient integration with a variety of research tools and data sources.
+
+ ## Model Files
+
+ | File Name | Size | Format | Description |
+ |-----------|------|--------|-------------|
+ | Jan-nano.F32.gguf | 16.1 GB | F32 | Full-precision 32-bit floating point |
+ | Jan-nano.F16.gguf | 8.05 GB | F16 | Half-precision 16-bit floating point |
+ | Jan-nano.BF16.gguf | 8.05 GB | BF16 | Brain floating point 16-bit (bfloat16) |
+
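A minimal sketch of fetching one of these files with the `huggingface_hub` Python client; the repo id `prithivMLmods/Jan-nano-GGUF` is assumed from this card's title, so substitute the actual repository path if it differs:

```python
# Download a single GGUF file from the Hub (repo id assumed from the card title).
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="prithivMLmods/Jan-nano-GGUF",  # assumed repo id
    filename="Jan-nano.F16.gguf",           # pick the precision that fits your hardware
)
print(model_path)  # local path of the cached download
```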
+ ## Usage
+
+ These GGUF-format files are optimized for use with llama.cpp and compatible inference engines. Choose a precision level based on your hardware capabilities and quality requirements (see the sketch after this list):
+
+ - **F32**: Highest quality, requires the most memory
+ - **F16/BF16**: Good balance of quality and memory efficiency
+
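As a minimal sketch, the snippet below loads one of these files through the `llama-cpp-python` bindings for llama.cpp; the file path and generation parameters are illustrative, not prescribed by this card:

```python
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="Jan-nano.F16.gguf",  # path to the downloaded GGUF file
    n_ctx=4096,                      # context window; lower it to save memory
    n_gpu_layers=-1,                 # offload all layers to the GPU when available
)

out = llm("Briefly explain what an MCP server is.", max_tokens=128)
print(out["choices"][0]["text"])
```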
+ ## Configuration
+
+ The model configuration is available in `config.json`.
+
+ ## Quants Usage
+
+ (Sorted by size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants.)
+
+ Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
+
+ ![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)