dd45babb-0c73-48fc-8094-eebfde4df372

This model is a fine-tuned version of lcw99/zephykor-ko-7b-chang on an unspecified dataset. Its validation loss at each evaluation step is listed in the table at the end of this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.000202
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: constant
lr_scheduler_warmup_steps: 50
training_steps: 400
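
The relationship between the batch-size values above can be checked with a little arithmetic. The sketch below is illustrative only: the number of devices is not stated in this card, so `num_devices = 1` is an assumption, chosen because it is consistent with the reported total_train_batch_size of 128.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 32
gradient_accumulation_steps = 4
training_steps = 400

# Assumption: a single device. The card does not state the device count.
num_devices = 1

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Number of training examples consumed over the whole run.
examples_seen = total_train_batch_size * training_steps

print(total_train_batch_size)  # 128
print(examples_seen)           # 51200
```

Under that assumption the run processes 51,200 examples, which matches the very small epoch fractions reported in the results table.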

Training Loss	Epoch	Step	Validation Loss
No log	0.0001	1	4.4533
10.7481	0.0056	50	2.5631
9.4069	0.0113	100	2.4389
9.64	0.0169	150	2.3673
9.6174	0.0226	200	2.3359
9.1584	0.0282	250	2.3844
9.1216	0.0339	300	2.3883
8.7793	0.0395	350	2.4921
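
To read the table at a glance, the following sketch picks out the evaluation step with the lowest validation loss. The rows are copied from the table above; the variable names are hypothetical and not part of any training script.

```python
# (step, validation_loss) pairs copied from the results table above.
results = [
    (1, 4.4533),
    (50, 2.5631),
    (100, 2.4389),
    (150, 2.3673),
    (200, 2.3359),
    (250, 2.3844),
    (300, 2.3883),
    (350, 2.4921),
]

# Find the row with the smallest validation loss.
best_step, best_loss = min(results, key=lambda row: row[1])

# Compare against the last logged evaluation.
last_step, last_loss = results[-1]
loss_rose_after_best = last_loss > best_loss

print(best_step, best_loss)    # 200 2.3359
print(loss_rose_after_best)    # True
```

The validation loss is lowest at step 200 and is higher at every later logged step, so a checkpoint near step 200 may be preferable to the final one if such a checkpoint was saved.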