Shukai Liu

skLiu

AI & ML interests

None yet

skLiu's activity

upvoted a paper 3 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 3 days ago • 88

updated a dataset 5 days ago

m-a-p/MdEval-Instruct

Viewer • Updated 5 days ago • 16k • 14 • 1

published a dataset 5 days ago

m-a-p/MdEval-Instruct

Viewer • Updated 5 days ago • 16k • 14 • 1

updated a collection 5 days ago

MdEval

Collection

MdEval: Massively Multilingual Code Debugging • 2 items • Updated 5 days ago

updated a dataset 5 days ago

Multilingual-Multimodal-NLP/MdEval-Instruct

Viewer • Updated 5 days ago • 16k • 14

published a dataset 5 days ago

Multilingual-Multimodal-NLP/MdEval-Instruct

Viewer • Updated 5 days ago • 16k • 14

updated a dataset 12 days ago

m-a-p/MdEval

Preview • Updated 12 days ago • 50 • 2

liked a dataset 12 days ago

m-a-p/MdEval

Preview • Updated 12 days ago • 50 • 2

updated a dataset 12 days ago

Multilingual-Multimodal-NLP/MdEval

Preview • Updated 12 days ago • 65 • 3

upvoted a paper 2 months ago

Evaluating and Aligning CodeLLMs on Human Preference

Paper • 2412.05210 • Published Dec 6, 2024 • 47

liked a dataset 3 months ago

Multilingual-Multimodal-NLP/MdEval

Preview • Updated 12 days ago • 65 • 3

upvoted a paper 4 months ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115

authored a paper 4 months ago

M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation

Paper • 2410.21157 • Published Oct 28, 2024 • 6

upvoted 2 papers 6 months ago

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Paper • 2409.03420 • Published Sep 5, 2024 • 26

FuzzCoder: Byte-level Fuzzing Test via Large Language Model

Paper • 2409.01944 • Published Sep 3, 2024 • 45

liked a dataset 6 months ago

Rtian/DebugBench

Viewer • Updated Jan 11, 2024 • 4.25k • 337 • 24

upvoted a paper 6 months ago

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52

liked a Space 7 months ago

Open LLM Leaderboard 🏆

Space • Track, rank and evaluate open LLMs and chatbots • 12.6k