Hugging Face
Catherine Arnett
catherinearnett
41 followers · 22 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
Updated a dataset 7 days ago: catherinearnett/eng_montok
Published a dataset 7 days ago: catherinearnett/eng_montok
Authored a paper about 1 month ago: BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization
Organizations
catherinearnett's datasets (2)
catherinearnett/eng_montok • Updated 7 days ago • 70
catherinearnett/morphscore • Viewer • Updated Jul 10 • 5.09M • 365 • 1