Sashavav/Translator


Translator

This is a research project to build a model that generates text: a small Transformer decoder that continues an input sequence.

How to launch in a Docker environment

How to launch in your environment

Clone the repository
Install dependencies with:

pip install poetry && poetry install

Run the code:

from Translator import Writer

# Load the pretrained weights; optionally append .to("cuda") to run on a GPU
writer = Writer.from_pretrained()
# A high temperature is highly recommended
print(writer(input_seq="One day I saw a ", temperature=2))
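
To illustrate what the temperature argument in the example above typically controls, here is a minimal sketch of temperature-scaled sampling in plain Python. It is not taken from this repository: the function names and the toy logits are hypothetical, and the actual Writer implementation may sample differently.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for a 4-token vocabulary
toy_logits = [4.0, 2.0, 1.0, 0.5]
sharp = softmax_with_temperature(toy_logits, temperature=1.0)
flat = softmax_with_temperature(toy_logits, temperature=2.0)
print("T=1:", [round(p, 3) for p in sharp])
print("T=2:", [round(p, 3) for p in flat])
print("sampled index:", sample_token(toy_logits, temperature=2.0))
```

A higher temperature flattens the distribution, so less likely tokens are picked more often and the generated text becomes more varied.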

Model architecture and training pipeline

Transformer decoder architecture with params:

decoder blocks = 4
vocab size = 8192
embedding_size = 512
number of heads = 8
hidden size in FFN = 1024
max_sequence_length = 128
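
As a rough sanity check on the model size, the parameters above can be turned into a parameter count estimate. This is a sketch under assumptions the model card does not state: one self-attention sublayer per block with biased Q, K, V and output projections, a two-layer FFN with biases, two LayerNorms per block, sinusoidal (parameter-free) positional encoding, and an output projection that is not tied to the token embedding.

```python
# Hyperparameters taken from the list above
n_blocks = 4
vocab_size = 8192
d_model = 512
n_heads = 8
d_ffn = 1024

# Each attention head works on an equal slice of the embedding
head_dim = d_model // n_heads

# Assumed layout (not confirmed by the model card), see the text above
token_embedding = vocab_size * d_model
attention = 4 * (d_model * d_model + d_model)  # Q, K, V and output projections
ffn = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)
layer_norms = 2 * (2 * d_model)  # scale and shift for two LayerNorms
per_block = attention + ffn + layer_norms
output_head = d_model * vocab_size + vocab_size  # untied projection to the vocabulary

total = token_embedding + n_blocks * per_block + output_head
print(f"head dim: {head_dim}")
print(f"params per block: {per_block:,}")
print(f"estimated total: {total:,}")
```

Under these assumptions the estimate comes to roughly 16.8 million parameters, about half of which sit in the token embedding and the output projection.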

Trained with params:

loss = CrossEntropyLoss
optimizer = Adam
batch size = 400
accumulation steps = 3
epochs = 10
number of sequences in dataset = 21M (21kk)

Total training time: 10 hours
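
The training figures above imply an effective batch size and an approximate throughput. The following sketch works them out, assuming the dataset size means 21 million sequences, that gradients are accumulated over 3 batches before each optimizer step, and that every epoch is a full pass over the dataset. None of these details are confirmed by the model card.

```python
import math

# Values taken from the training parameters above
batch_size = 400
accumulation_steps = 3
epochs = 10
training_hours = 10
num_sequences = 21_000_000  # assumption: "21kk" read as 21 million

# Sequences that contribute to a single optimizer update
effective_batch = batch_size * accumulation_steps
updates_per_epoch = math.ceil(num_sequences / effective_batch)
total_updates = updates_per_epoch * epochs

# Average throughput implied by the reported wall-clock time
sequences_per_second = num_sequences * epochs / (training_hours * 3600)

print(f"effective batch size: {effective_batch}")
print(f"optimizer updates per epoch: {updates_per_epoch:,}")
print(f"total optimizer updates: {total_updates:,}")
print(f"throughput: {sequences_per_second:,.0f} sequences/s")
```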

Sources

Architecture inspired by "Attention Is All You Need"
Dataset
