prithivMLmods committed · verified
Commit 03480ea · 1 Parent(s): c49e02f

Update README.md

Files changed (1): README.md +33 -1
README.md CHANGED
@@ -8,4 +8,36 @@ pipeline_tag: text-generation
  library_name: transformers
  tags:
  - text-generation-inference
- ---
+ ---
+
+ # **Jan-nano-GGUF**
+
+ > Jan-nano is a compact 4-billion-parameter language model designed and trained for deep research tasks. It is optimized to work seamlessly with Model Context Protocol (MCP) servers, enabling efficient integration with a variety of research tools and data sources.
+
+ ## Model Files
+
+ | File Name | Size | Format | Description |
+ |-----------|------|--------|-------------|
+ | Jan-nano.F32.gguf | 16.1 GB | F32 | Full-precision 32-bit floating point |
+ | Jan-nano.F16.gguf | 8.05 GB | F16 | Half-precision 16-bit floating point |
+ | Jan-nano.BF16.gguf | 8.05 GB | BF16 | Brain floating point 16-bit (bfloat16) |
+
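A minimal sketch of fetching one of these files with the `huggingface_hub` Python client; the repo id `prithivMLmods/Jan-nano-GGUF` is assumed from this card's title, so substitute the actual repository path if it differs:

```python
# Download a single GGUF file from the Hub (repo id assumed from the card title).
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="prithivMLmods/Jan-nano-GGUF",  # assumed repo id
    filename="Jan-nano.F16.gguf",           # pick the precision that fits your hardware
)
print(model_path)  # local path of the cached download
```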
+ ## Usage
+
+ These GGUF-format files are optimized for use with llama.cpp and compatible inference engines. Choose a precision level based on your hardware capabilities and quality requirements (see the sketch after this list):
+
+ - **F32**: Highest quality, requires the most memory
+ - **F16/BF16**: Good balance of quality and memory efficiency
+
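As a minimal sketch, the snippet below loads one of these files through the `llama-cpp-python` bindings for llama.cpp; the file path and generation parameters are illustrative, not prescribed by this card:

```python
# Minimal local-inference sketch using llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="Jan-nano.F16.gguf",  # path to the downloaded GGUF file
    n_ctx=4096,                      # context window; lower it to save memory
    n_gpu_layers=-1,                 # offload all layers to the GPU when available
)

out = llm("Briefly explain what an MCP server is.", max_tokens=128)
print(out["choices"][0]["text"])
```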
+ ## Configuration
+
+ The model configuration is available in `config.json`.
+
+ ## Quants Usage
+
+ (Sorted by size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants.)
+
+ Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
+
+ ![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)