ChenJing
CelesteChen
AI & ML interests
Computer Vision, Natural Language Generation
Recent Activity
Updated a collection 1 day ago: deepsearch
Updated a collection 1 day ago: Align
Updated a collection 8 days ago: RAG
Organizations
None yet
acceleration
deepsearch
code
multilingual
- Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages
  Paper • 2411.12240 • Published • 7
- LLäMmlein: Compact and Competitive German-Only Language Models from Scratch
  Paper • 2411.11171 • Published • 8
- Xmodel-1.5: An 1B-scale Multilingual LLM
  Paper • 2411.10083 • Published • 14
- Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement
  Paper • 2412.04003 • Published • 11
RAG
- HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
  Paper • 2411.02959 • Published • 72
- Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback
  Paper • 2410.21242 • Published • 8
- SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution
  Paper • 2501.05040 • Published • 15
- Multi-task retriever fine-tuning for domain-specific and efficient RAG
  Paper • 2501.04652 • Published • 10
long-context
- Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement
  Paper • 2410.15633 • Published • 7
- When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
  Paper • 2411.13476 • Published • 16
- LongKey: Keyphrase Extraction for Long Documents
  Paper • 2411.17863 • Published • 12
Align
- PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
  Paper • 2410.13785 • Published • 19
- Aligning Large Language Models via Self-Steering Optimization
  Paper • 2410.17131 • Published • 23
- Baichuan Alignment Technical Report
  Paper • 2410.14940 • Published • 52
- SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation
  Paper • 2410.14745 • Published • 48
application
confidence
- Deep Think with Confidence
  Paper • 2508.15260 • Published • 81
- Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
  Paper • 2508.12040 • Published • 14
- InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States
  Paper • 2406.12053 • Published
- Neither Valid nor Reliable? Investigating the Use of LLMs as Judges
  Paper • 2508.18076 • Published • 5
models
diffusion
reasoning
- Large Language Models Can Self-Improve in Long-context Reasoning
  Paper • 2411.08147 • Published • 67
- Reverse Thinking Makes LLMs Stronger Reasoners
  Paper • 2411.19865 • Published • 23
- Training Large Language Models to Reason in a Continuous Latent Space
  Paper • 2412.06769 • Published • 90
- HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
  Paper • 2412.18925 • Published • 105
others
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
  Paper • 2410.17243 • Published • 95
- StyleMaster: Stylize Your Video with Artistic Generation and Translation
  Paper • 2412.07744 • Published • 20
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 155
- Autoregressive Universal Video Segmentation Model
  Paper • 2508.19242 • Published • 26
math
- A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
  Paper • 2410.13639 • Published • 19
- Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch
  Paper • 2410.18693 • Published • 43
- U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs
  Paper • 2412.03205 • Published • 16
- Free Process Rewards without Process Labels
  Paper • 2412.01981 • Published • 35
LLM-general
- Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
  Paper • 2410.10814 • Published • 52
- MiniPLM: Knowledge Distillation for Pre-Training Language Models
  Paper • 2410.17215 • Published • 17
- CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
  Paper • 2410.16256 • Published • 61
- CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models
  Paper • 2410.18505 • Published • 11
RL infra