---
base_model:
- Qwen/Qwen2.5-Coder-0.5B
---
The code embedding model trained by Jina AI.
# Jina Embeddings c1: A Small but Performant Code Embedding Model

## Intended Usage & Model Info

`jina-embeddings-c1` is an embedding model for code retrieval. The model supports various types of code retrieval (text-to-code, code-to-code, code-to-text, code-to-completion) and technical question answering across 15+ programming languages.

Built on [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B), `jina-embeddings-c1` features:

- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, software development, machine learning, data science, and educational coding problems.
- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, and Technical QA, which can be selected at inference time.
- **Flexible embedding size**: dense embeddings are 896-dimensional by default but can be truncated to as low as 64 with minimal performance loss.

Summary of features:

| Feature | Jina Embeddings c1 |
|------------|------------|
| Base Model | Qwen2.5-Coder-0.5B |
| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `code2completion`, `qa` |
| Model DType | BFloat16 |
| Max Sequence Length | 32768 |
| Embedding Vector Dimension | 896 |
| Matryoshka dimensions | 64, 128, 256, 512, 896 |
| Pooling Strategy | Last-token pooling |
| Attention Mechanism | FlashAttention2 |

## Usage
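
Two entries in the feature summary, last-token pooling and Matryoshka dimensions, can be illustrated without any model-specific API. The following is a minimal NumPy sketch, not the model's own implementation: the helper names `last_token_pool` and `truncate_embeddings` are hypothetical, the hidden states are random stand-ins for real model outputs, and the pooling helper assumes right-padded batches. It shows how a per-sequence embedding could be taken from the last non-padding token and then shortened to one of the listed Matryoshka sizes, with re-normalization so that cosine similarity still behaves as expected.

```python
import numpy as np

# Matryoshka sizes listed in the feature table.
MATRYOSHKA_DIMS = (64, 128, 256, 512, 896)


def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token of each sequence.

    Assumption: the batch is right-padded, so the last real token of row i
    sits at index attention_mask[i].sum() - 1.
    """
    hidden_states = np.asarray(hidden_states)
    attention_mask = np.asarray(attention_mask)
    last_idx = attention_mask.sum(axis=1) - 1
    batch_idx = np.arange(hidden_states.shape[0])
    return hidden_states[batch_idx, last_idx]


def truncate_embeddings(embeddings, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    embeddings = np.asarray(embeddings, dtype=np.float32)
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)


# Random stand-in for model outputs: 2 sequences, 5 token positions, 896 dims.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((2, 5, 896)).astype(np.float32)
mask = np.array([[1, 1, 1, 0, 0],   # 3 real tokens, 2 padding tokens
                 [1, 1, 1, 1, 1]])  # 5 real tokens

pooled = last_token_pool(hidden, mask)    # shape (2, 896)
small = truncate_embeddings(pooled, 64)   # shape (2, 64)
print(pooled.shape, small.shape)
```

Re-normalizing after truncation matters because the first 64 components of a unit-length 896-dimensional vector are generally not unit-length themselves. Queries and documents should be truncated to the same size before they are compared.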