Hynek Kydlicek

hynky

AI & ML interests

Data-processing

Recent Activity

new activity 10 days ago

HuggingFaceFW/finepdfs:Which language detector did you use

new activity 13 days ago

HuggingFaceFW/finepdfs:The "file_path" data field appears to primarily contain cc-index paths rather than WARC paths.

new activity 13 days ago

HuggingFaceFW/finepdfs:A Few Questions About the Implementation Details of the finepdfs Project

Organizations

Articles 8

Article

281

Supercharge your OCR Pipelines with Open Models

Article

32

FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages

Collections 4

Papers 5

arxiv:2506.20920

arxiv:2502.02737

arxiv:2501.08365

arxiv:2406.17557

Spaces 6

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

Static Test

Display evaluation tasks for scaling FineWeb

Test

Create a static web page by editing HTML

CZ-EVAL

News Classification

Sklearn Proxy

Models 9

hynky/Llama-3.2-1B-no-bos

hynky/codellama-7b-sft-lora-func-names-java-4bit

Updated Jan 9, 2024 • 8

hynky/Day_of_week

Text Classification • Updated Jan 2, 2024 • 17

hynky/codellama-7b-sft-lora-func-names-4bit

Updated Dec 28, 2023 • 4

hynky/codellama-7b-sft-lora-func-names

Text Generation • Updated Dec 28, 2023 • 18

hynky/Gender

Text Classification • Updated Dec 17, 2023 • 19

hynky/Category

Text Classification • Updated Dec 17, 2023 • 15

hynky/Server

Text Classification • Updated Dec 17, 2023 • 14 • 1

hynky/hyneczech-base

Fill-Mask • Updated Mar 30, 2023 • 11

Datasets 21

hynky/docling-issues

Viewer • Updated Jun 16 • 63 • 31

hynky/ioi-leaderboard

Viewer • Updated Feb 28 • 2 • 17

hynky/NuminaMath-correctness-v3

Viewer • Updated Feb 4 • 27.7k • 33

hynky/NuminaMath-R1-correctness

Viewer • Updated Jan 27 • 800 • 30

hynky/thai_exam_zip

Viewer • Updated Jul 18, 2024 • 590 • 142

hynky/mmlu_okapi

Preview • Updated Jul 11, 2024 • 220

hynky/okapi_arc_challenge

Viewer • Updated Jun 28, 2024 • 23.6k • 661

hynky/czech_news_dataset_v2

Viewer • Updated Jun 20, 2024 • 1.93M • 961 • 3

hynky/klokan-qa

Viewer • Updated May 6, 2024 • 4.05k • 145 • 3

hynky/datatrove-test-1-shard

Viewer • Updated May 5, 2024 • 5 • 20
