Hugging Face
Catherine Arnett

catherinearnett
https://catherinearnett.github.io/
  • linguist_cat
  • catherinearnett
  • catherinearnett.bsky.social

AI & ML interests

multilingual NLP, tokenization

Recent Activity

updated a dataset 6 days ago
catherinearnett/eng_montok
published a dataset 6 days ago
catherinearnett/eng_montok
authored a paper about 1 month ago
BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization

Organizations

  • Blog-explorers
  • Language and Cognition Lab (UCSD)

catherinearnett's collections (1)

B-GPT
Bilingual GPT-2 models with checkpoints
  • catherinearnett/B-GPT_en_nl_simultaneous

    Text Generation • 0.1B • Updated Jun 12 • 2.79k
  • catherinearnett/B-GPT_nl_en_simultaneous

    Text Generation • 0.1B • Updated Jun 12 • 325
  • catherinearnett/B-GPT_en_nl_sequential

    Text Generation • 0.1B • Updated Jun 12 • 132
  • catherinearnett/B-GPT_nl_en_sequential

    Text Generation • 0.1B • Updated Jun 12 • 196