Data and filtering models for YiZhao, our open-source financial dataset.
Organization Card
Text Machine Group (TMG) from Harbin Institute of Technology (Shenzhen). 🔥
- KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model
  Paper • 2501.01028 • Published • 17
- KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model
  Paper • 2506.20923 • Published • 4
- HIT-TMG/KaLM-embedding-multilingual-mini-v1
  Sentence Similarity • 0.5B • Updated • 1.14k • 27
- HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1
  Sentence Similarity • 0.5B • Updated • 102 • 32
models 23

HIT-TMG/EviOmni-nq_train-1.5B
Question Answering • 2B • Updated • 159 • 5

HIT-TMG/EviOmni-nq_train-7B
Question Answering • 8B • Updated • 306 • 2

HIT-TMG/CIGEval-Qwen2.5-VL-7B-Instruct-sft
8B • Updated • 8

HIT-TMG/CIGEval-Qwen2-VL-7B-Instruct-sft
8B • Updated • 10

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v2
Feature Extraction • 0.5B • Updated • 490 • 25

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1-GGUF
Sentence Similarity • 0.5B • Updated • 25

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5-GGUF
Sentence Similarity • 0.5B • Updated • 36

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5
Sentence Similarity • 0.5B • Updated • 2.12k • 60

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1
Sentence Similarity • 0.5B • Updated • 102 • 32

HIT-TMG/KaLM-embedding-multilingual-mini-unsupervised
0.5B • Updated • 3
datasets 6

HIT-TMG/CIGEval_sft_data
Viewer • Updated • 6.63k • 170

HIT-TMG/YiZhao
Viewer • Updated • 47.5M • 294 • 6

HIT-TMG/KaLM-embedding-pretrain-data
Viewer • Updated • 23.7M • 465 • 5

HIT-TMG/MultiSkill
Viewer • Updated • 1k • 2

HIT-TMG/TruthReader_RAG_train
Viewer • Updated • 7.16k • 23 • 6

HIT-TMG/Hansel
Viewer • Updated • 7.81M • 3.51k • 8