s1lv3rj1nx/ch1

s1lv3rj1nx committed 26 days ago

Commit b3d0ec2 (verified)

1 Parent(s): 262a2b6

Update README.md

Files changed (1)

README.md

@@ -6,4 +6,29 @@ language:
 - en
 tags:
 - translation
----

+---
+This repository contains the trained model checkpoints for `Ch1 - Attention Is All You Need`. This chapter builds a Transformer from scratch for `English`-to-`Hindi` translation. Any of the checkpoints can be used for inference.
+Loss Graph:
+![image.png](https://cdn-uploads.huggingface.co/production/uploads/62790519541f3d2dfa79a6cb/8_J-C6FItlpHxQpihw-NN.png)
+Training specs: trained on an NVIDIA A10 GPU (24 GB) for 12 hours.
+```python
+def get_config():
+    # Training configuration used for this run.
+    return {
+        'batch_size': 85,
+        'num_samples': 1000000,
+        'num_epochs': 10,
+        'lr': 10**-4,
+        'seq_len': 128,
+        'd_model': 512,
+        'datasource': "runs",
+        'tgt_language': 'hi',
+        'model_folder': 'weights',
+        'model_basename': 'tmodel_',
+        'preload': None,
+        'tokenizer_folder': 'tokenizer',
+        'vocab_size': 52000,
+    }
+```
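
To give a sense of scale for the configuration in this README, here is a minimal sketch of the arithmetic it implies. It assumes (this is not stated in the commit) that each epoch iterates over all `num_samples` examples once, that the final partial batch is kept, and that every sequence is padded to `seq_len`. The variable names `steps_per_epoch`, `total_steps`, and `positions_per_batch` are illustrative only.

```python
import math

# Values copied from the config in the README diff.
config = {
    'batch_size': 85,
    'num_samples': 1000000,
    'num_epochs': 10,
    'seq_len': 128,
}

# Assumption: one full pass over num_samples per epoch, last partial batch kept.
steps_per_epoch = math.ceil(config['num_samples'] / config['batch_size'])

# Assumption: all num_epochs epochs are run to completion.
total_steps = steps_per_epoch * config['num_epochs']

# Upper bound on token positions per batch if every sequence is padded to seq_len.
positions_per_batch = config['batch_size'] * config['seq_len']

print(steps_per_epoch)      # 11765
print(total_steps)          # 117650
print(positions_per_batch)  # 10880
```

Under these assumptions the run would take roughly 117,650 optimizer steps. The actual count depends on how the training script batches the data and whether training ran for all ten epochs.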