Model Card for Model ID

This model is an adapter for the base TinyLlama/TinyLlama-1.1B-Chat-v1.0, trained for tweet classification with 3 classes: positive, neutral, or negative. It improves the F1 score on the task from 0.21 (base model) to 0.51.
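A minimal inference sketch using transformers and peft. The adapter repo id and the prompt wording below are placeholders / assumptions, not details stated in this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "your-username/your-adapter"  # placeholder: replace with this model's repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float32)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the trained adapter
model.eval()

# Hypothetical prompt format; adjust to whatever the adapter was trained with.
prompt = "Classify the sentiment of this tweet as positive, neutral, or negative:\nGreat game last night!"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```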

Training Details

Training Data

The model was trained on cardiffnlp/tweet_eval, a benchmark created exactly for this task: tweet classification.
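The three classes correspond to the label ids used by the sentiment subset of cardiffnlp/tweet_eval; a small mapping sketch that is handy when decoding predictions:

```python
# Label ids in the "sentiment" subset of cardiffnlp/tweet_eval.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

def decode_prediction(pred_id: int) -> str:
    """Map a predicted class id back to its sentiment label."""
    return ID2LABEL[pred_id]

print(decode_prediction(2))  # -> positive
```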

Training Procedure

The model was trained for 3 epochs, a standard number for training adapters. The learning rate was 1e-4, and the batch size was 24, the maximum that fit in memory. The rank of the adapter matrices was the standard 8, with alpha 16. AdamW was used as the optimizer. DoRA layers were applied only to the v_proj and k_proj projections.
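The setup above roughly corresponds to a peft adapter config like the following (a sketch under the stated hyperparameters; any argument not mentioned in this card, such as the task type, is an assumption):

```python
from peft import LoraConfig, get_peft_model

# Rank-8 DoRA adapter on the key and value projections, alpha = 16.
dora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["v_proj", "k_proj"],
    use_dora=True,          # DoRA: weight-decomposed low-rank adaptation
    task_type="CAUSAL_LM",  # assumption: classification framed as generation
)
# model = get_peft_model(base_model, dora_config)  # attach to the base TinyLlama
```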

Results

The F1 score improved from 0.21 to 0.51 in only 3 epochs, while updating only ~0.15% of the model's weights. Cool!
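For context, the reported F1 is presumably macro-averaged over the three classes, the usual metric for this benchmark (an assumption; the card does not name the averaging). A self-contained sketch of that computation:

```python
def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Macro-averaged F1: compute per-class F1, then average with equal weight."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall) if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

print(round(macro_f1([0, 1, 2, 2], [0, 1, 1, 2]), 3))  # -> 0.778
```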
