abaryan/BioXP-0.5B-MedMCQA

Question Answering


Abaryan committed on Jun 1

Commit 8802d8c (verified) · 1 parent: 44ebb1e

Update README.md

Files changed (1)

README.md +6 -1


@@ -13,7 +13,12 @@ tags:
 # Model Card for BioXP
-This model is a 🤗 transformers model, BioXP-0.5B.
+BioXP-0.5B is a 🤗 Transformers model trained with our two-stage fine-tuning approach:
+1. Supervised Fine-Tuning (SFT): The model was first fine-tuned on labeled data (MedMCQA) to establish strong baseline accuracy on multiple-choice medical QA tasks.
+2. Group Relative Policy Optimization (GRPO): In the second stage, GRPO, a reinforcement learning technique, was applied to further align the model with human-like reasoning patterns.
+This second stage is intended to improve the coherence and quality of the model's explanations and the reliability of its answers.
 ## Model Details
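
For readers unfamiliar with the GRPO stage named in the added README text, the sketch below illustrates its core idea: several answers are sampled for the same question, each is scored, and each score is normalized against the group's mean and standard deviation to give a relative advantage. This is a minimal illustration and not the authors' training code. The function names, the exact-match reward, the sample answers, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def correctness_reward(predicted: str, gold: str) -> float:
    """Hypothetical reward: 1.0 if the chosen option letter matches the gold label."""
    return 1.0 if predicted.strip().upper() == gold.strip().upper() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    # eps keeps the division defined when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four made-up sampled answers to one multiple-choice question whose gold label is "B"
sampled_answers = ["B", "C", "B", "A"]
rewards = [correctness_reward(a, "B") for a in sampled_answers]
advantages = group_relative_advantages(rewards)

for answer, advantage in zip(sampled_answers, advantages):
    print(f"{answer}: {advantage:+.3f}")
```

Answers that score above the group average receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline. When all sampled answers score the same, every advantage is zero and that group contributes no learning signal.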