Update README.md
README.md CHANGED
@@ -14,14 +14,14 @@ license: cc-by-nc-4.0
<b>The code embedding model trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
</p>

-# Jina Embeddings
+# Jina Code Embeddings: A Small but Performant Code Embedding Model

## Intended Usage & Model Info

-`jina-embeddings
+`jina-code-embeddings` is an embedding model for code retrieval.
The model supports various types of code retrieval (text-to-code, code-to-code, code-to-text, code-to-completion) and technical question answering across 15+ programming languages.


-Built on [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B), `jina-embeddings-
+Built on [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B), `jina-code-embeddings-0.5b` features:

- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, software development, machine learning, data science, and educational coding problems.
- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, and Technical QA, which can be selected at inference time.
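The exact prefix strings are not shown in this diff. As a rough, non-authoritative sketch, selecting a task at inference time amounts to prepending the matching instruction to the input before encoding; the prefix text below is a placeholder, not the model's actual wording:

```python
# Illustrative only: the real instruction prefixes are documented in the model card,
# not in this diff. The strings below are placeholders standing in for those prefixes.
TASK_PREFIXES = {
    "nl2code": "<nl2code instruction prefix>: ",
    "qa": "<technical-qa instruction prefix>: ",
}

def with_task_prefix(task: str, text: str) -> str:
    """Prepend the task-specific instruction prefix before embedding the text."""
    return TASK_PREFIXES[task] + text

query = with_task_prefix("nl2code", "reverse a list in Python")
```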
@@ -30,7 +30,7 @@ Built on [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5

Summary of features:

-| Feature | Jina Embeddings
+| Feature | Jina Code Embeddings 0.5B |
|------------|------------|
| Base Model | Qwen2.5-Coder-0.5B |
| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `code2completion`, `qa` |
@@ -66,7 +66,7 @@ from transformers import AutoModel
import torch

# Initialize the model
-model = AutoModel.from_pretrained("jinaai/jina-embeddings-
+model = AutoModel.from_pretrained("jinaai/jina-code-embeddings-0.5b", trust_remote_code=True)
model.to("cuda")

# Configure truncate_dim, max_length, batch_size in the encode function if needed
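The hunk above cuts off before the actual encoding call. As a hedged sketch only, assuming the custom remote code exposes an `encode` helper that accepts the options named in that comment (`truncate_dim`, `max_length`, `batch_size`), usage might continue along these lines:

```python
# Hedged sketch: assumes the remote code's encode() accepts the options mentioned
# in the README comment above (truncate_dim, max_length, batch_size).
texts = [
    "how do I reverse a list in Python?",      # natural-language query
    "def reverse(xs):\n    return xs[::-1]",   # candidate code snippet
]
embeddings = model.encode(texts, batch_size=2)  # pass truncate_dim / max_length here if needed
print(len(embeddings))
```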
@@ -98,7 +98,7 @@ from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer(
-    "jinaai/jina-embeddings-
+    "jinaai/jina-code-embeddings-0.5b",
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "attn_implementation": "flash_attention_2",
@@ -129,7 +129,7 @@ print(similarity)

## Training & Evaluation

-Please refer to our technical report of jina-embeddings
+Please refer to our technical report of jina-code-embeddings for training details and benchmarks.

## Contact