Update pipeline tag and project page URL
This PR improves the model card by:
* Updating the `pipeline_tag` from `text-generation` to `text-classification`, which more accurately reflects the model's core function of evaluating text against policies to classify compliance. This enhances discoverability for users looking for guardrail or text classification models on the Hugging Face Hub (https://huggingface.co/models?pipeline_tag=text-classification).
* Correcting the `project_page` URL in the metadata from the GitHub repository to the dedicated project page: `https://taruschirag.github.io/DynaGuard/`.
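To sanity-check the merged metadata, here is a stdlib-only sketch that pulls the top-level keys out of a model card's YAML front matter. The `CARD` string is an abbreviated copy of the front matter after this PR, and `front_matter_fields` is a hypothetical helper written for this illustration, not part of any Hub API; a real pipeline would use a proper YAML parser.

```python
# Minimal front-matter check: extract top-level `key: value` pairs from the
# YAML block at the top of a model card (stdlib-only illustration).

CARD = """\
---
base_model:
- Qwen/Qwen3-1.7B
datasets:
- tomg-group-umd/DynaBench
language: en
library_name: transformers
license: apache-2.0
pipeline_tag: text-classification
tags:
- guardrail
- safety
repo_url: https://github.com/montehoover/DynaGuard
paper_url: https://arxiv.org/abs/2509.02563
project_page: https://taruschirag.github.io/DynaGuard/
---

# DynaGuard-1.7B
"""

def front_matter_fields(card_text: str) -> dict:
    """Return top-level key/value pairs from the leading YAML block."""
    lines = card_text.splitlines()
    assert lines[0] == "---", "card must start with a front-matter fence"
    end = lines.index("---", 1)          # closing fence of the front matter
    fields = {}
    for line in lines[1:end]:
        if line.startswith((" ", "-")):  # skip list items and nested values
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields

fields = front_matter_fields(CARD)
print(fields["pipeline_tag"])   # text-classification
print(fields["project_page"])   # https://taruschirag.github.io/DynaGuard/
```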
README.md CHANGED

````diff
@@ -1,8 +1,12 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-1.7B
+datasets:
+- tomg-group-umd/DynaBench
 language: en
 library_name: transformers
-pipeline_tag: text-generation
+license: apache-2.0
+pipeline_tag: text-classification
 tags:
 - guardrail
 - safety
@@ -11,13 +15,9 @@ tags:
 - umd
 - qwen3
 - llm
-datasets:
-- tomg-group-umd/DynaBench
-base_model:
-- Qwen/Qwen3-1.7B
 repo_url: https://github.com/montehoover/DynaGuard
 paper_url: https://arxiv.org/abs/2509.02563
-project_page: https://github.
+project_page: https://taruschirag.github.io/DynaGuard/
 ---
 
 # DynaGuard-1.7B 🛡️
@@ -34,17 +34,17 @@ The DynaGuard series achieves state-of-the-art performance across a wide range o
 
 ## Model Details
 
-*
-*
-*
-*
-*
+* **Developed by:** University of Maryland, Capital One
+* **Base Model:** [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B)
+* **Model Type:** Decoder-only Transformer
+* **Training Data:** Fine-tuned on a mixture of the **[DynaBench](https://huggingface.co/tomg-group-umd/DynaBench)** dataset and several safety benchmarks (WildGuard, BeaverTails, ToxicChat, Aegis 2.0).
+* **Training Procedure:** The model was trained using Supervised Fine-Tuning (SFT) for one epoch, followed by GRPO.
 
 ### Key Features
 
-*
-*
-*
+* **Dynamic Policies:** Accepts arbitrary guardrail policies written in natural language, allowing for bespoke and application-specific moderation.
+* **Interpretability:** Can generate detailed, natural-language explanations for why a policy was violated, enabling chatbot recovery and human-in-the-loop refinement.
+* **Dual-Mode Inference:** Supports two modes for flexibility:
 1. **Fast Inference:** Provides a direct `PASS` or `FAIL` classification for minimal latency.
 2. **Chain-of-Thought (CoT):** Generates a reasoning trace before giving the final classification, offering interpretability.
 
@@ -108,7 +108,7 @@ Evaluate the following dialogue for compliance with the given policy. Provide th
 """
 inputs = tokenizer(fast_prompt, return_tensors="pt").to(model.device)
 outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.1)
-print("
+print("\n--- Fast Inference Mode Output ---")
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
````