Update README.md
README.md
CHANGED
@@ -17,13 +17,13 @@ base_model:
 
 ## Intended Usage & Model Info
 `jina-embeddings-c1` is an embedding model for code retrieval.
-The model supports natural language-to-code, code-to-code,
+The model supports various types of code retrieval (natural language-to-code, code-to-code, code-to-natural language, code-to-completion) and technical question answering across 15+ programming languages.
 
 
 Built on [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B), `jina-embeddings-c1` features:
 
-- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, machine learning,
-- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL,
+- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, software development, machine learning, data science, and educational coding problems.
+- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, and Technical QA, which can be selected at inference time.
 - **Flexible embedding size**: dense embeddings are 896-dimensional by default but can be truncated to as low as 64 with minimal performance loss.
 
 
@@ -32,7 +32,7 @@ Summary of features:
 | Feature | Jina Embeddings C1 |
 |------------|------------|
 | Base Model | Qwen2.5-Coder-0.5B |
-| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `
+| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `code2completion`, `qa` |
 | Model DType | BFloat 16 |
 | Max Sequence Length | 32768 |
 | Embedding Vector Dimension | 896 |
@@ -40,13 +40,10 @@ Summary of features:
 | Pooling Strategy | Last-token pooling |
 | Attention Mechanism | FlashAttention2 |
 
-
-
 ## Training & Evaluation
 
 Please refer to our technical report of jina-embeddings-c1 for training details and benchmarks.
 
 ## Contact
 
-Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
-```
+Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
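The README text in this change says the 896-dimensional embeddings "can be truncated to as low as 64 with minimal performance loss". The sketch below shows one way that truncation could be applied on the client side. It assumes, since the README does not say so, that truncation means keeping the leading components and re-normalizing to unit length before computing cosine similarity. `truncate_embedding` is a hypothetical helper, not part of the model's API, and the input vector is synthetic stand-in data rather than real model output.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int = 64) -> List[float]:
    """Keep the first `dim` components of `vec` and rescale to unit L2 norm.

    Hypothetical helper: assumes the leading components carry most of the
    signal, which the README implies but does not spell out.
    """
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = [float(x) for x in vec[:dim]]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Synthetic 896-dimensional stand-ins for a query and a code snippet.
    query = [math.sin(i) for i in range(896)]
    snippet = [math.sin(i + 0.1) for i in range(896)]

    small_query = truncate_embedding(query, 64)
    small_snippet = truncate_embedding(snippet, 64)
    print(len(small_query), cosine(small_query, small_snippet))
```

Re-normalizing after slicing keeps dot products and cosine similarities on the same scale as for full-length vectors, which matters if an index assumes unit-norm inputs.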