ritabratamaiti committed on
Commit 1bbb46d · verified · 1 Parent(s): 0c17114

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -11,12 +11,14 @@ language:
11
  - en
12
  ---
13
 
 
 
14
  # Uploaded model
15
 
16
  - **Developed by:** Xilabs
17
  - **License:** apache-2.0
18
  - **Finetuned from model :** unsloth/phi-4-bnb-4bit
19
 
20
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
21
 
22
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
11
  - en
12
  ---
13
 
14
+ We present MolRex, a reinforcement learning framework that combines Group Relative Policy Optimization (GRPO) with chain-of-thought fine-tuning of large language models (LLMs) to improve molecular structures through guided reasoning. MolRex trains models to propose chemically valid structural edits along with interpretable rationales, optimizing responses based on a composite reward signal that includes synthesizability, drug-likeness, human-aligned molecular preferences, and format validity. While additional metrics such as reasoning brevity are implemented for future integration, current training prioritizes chemically meaningful and syntactically robust outputs. By leveraging relative comparisons between candidate generations instead of absolute value estimation, MolRex facilitates stable training and avoids the complexity of critic networks. Experimental results show that MolRex enhances molecular properties while offering transparent rationales, making it a promising step toward interpretable, reasoning-augmented molecular design.
15
+
16
  # Uploaded model
17
 
18
  - **Developed by:** Xilabs
19
  - **License:** apache-2.0
20
  - **Finetuned from model :** unsloth/phi-4-bnb-4bit
21
 
22
+ This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
23
 
24
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
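
The paragraph added in this commit says MolRex optimizes a composite reward and relies on relative comparisons between candidate generations rather than a critic network. As a rough illustration of that idea only, the sketch below computes group-relative advantages in the style of GRPO: each candidate's reward is normalized against the mean and standard deviation of its own sampling group. The reward component names follow the README text, but the equal weights, the function names, and the use of the population standard deviation are assumptions made for this example and are not taken from the MolRex code.

```python
from statistics import mean, pstdev

# Hypothetical equal weights; the README does not state the actual weighting.
WEIGHTS = {
    "synthesizability": 1.0,
    "drug_likeness": 1.0,
    "preference": 1.0,
    "format_validity": 1.0,
}


def composite_reward(scores, weights=WEIGHTS):
    """Weighted sum of per-component scores for one candidate generation."""
    return sum(weights[name] * scores[name] for name in weights)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of candidates for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so no
    learned value function (critic) is needed to provide a baseline.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Three made-up candidates sampled for the same input molecule.
group = [
    {"synthesizability": 0.2, "drug_likeness": 0.5, "preference": 0.1, "format_validity": 1.0},
    {"synthesizability": 0.6, "drug_likeness": 0.7, "preference": 0.4, "format_validity": 1.0},
    {"synthesizability": 0.9, "drug_likeness": 0.8, "preference": 0.7, "format_validity": 0.0},
]
rewards = [composite_reward(c) for c in group]
advantages = group_relative_advantages(rewards)
```

Candidates that score above their group's average get a positive advantage and those below get a negative one, which is what lets the policy update proceed without estimating absolute values.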