Update `pipeline_tag` and correct `project_page` URL in metadata
This PR improves the model card metadata by:
- Changing the `pipeline_tag` from `text-generation` to `text-classification`. This more accurately reflects the model's core functionality of evaluating text for policy compliance and providing a classification (PASS/FAIL), ensuring better discoverability at https://huggingface.co/models?pipeline_tag=text-classification.
- Correcting the `project_page` URL from `https://github.com/taruschirag/DynaGuard` to `https://taruschirag.github.io/DynaGuard/` to point to the official project website, ensuring consistency and accuracy.
No changes are made to the markdown content as it is already comprehensive and well-structured.
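As a sanity check, the effect of the new tag can be verified programmatically. Below is a minimal sketch, assuming a recent `huggingface_hub` client in which `HfApi.list_models()` accepts `pipeline_tag` and `search` arguments directly; it queries the same filter that backs the URL above.

```python
# Minimal sketch: confirm the model surfaces under the new pipeline tag.
# Assumes a recent huggingface_hub release where list_models() takes
# `pipeline_tag` directly (older releases used ModelFilter instead).
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(pipeline_tag="text-classification", search="DynaGuard", limit=5):
    print(model.id)  # each hit is a ModelInfo; .id is the repo id
```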
README.md (changed):

````diff
@@ -1,8 +1,12 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-4B
+datasets:
+- tomg-group-umd/DynaBench
 language: en
 library_name: transformers
-pipeline_tag: text-generation
+license: apache-2.0
+pipeline_tag: text-classification
 tags:
 - guardrail
 - safety
@@ -11,13 +15,9 @@ tags:
 - umd
 - qwen3
 - llm
-datasets:
-- tomg-group-umd/DynaBench
-base_model:
-- Qwen/Qwen3-4B
 repo_url: https://github.com/montehoover/DynaGuard
 paper_url: https://arxiv.org/abs/2509.02563
-project_page: https://github.com/taruschirag/DynaGuard
+project_page: https://taruschirag.github.io/DynaGuard/
 ---
 
 # DynaGuard-4B 🛡️
@@ -35,17 +35,17 @@ The DynaGuard series achieves state-of-the-art performance across a wide range o
 
 ## Model Details
 
-*
-*
-*
-*
-*
+* **Developed by:** University of Maryland, Capital One
+* **Base Model:** [Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B)
+* **Model Type:** Decoder-only Transformer
+* **Training Data:** Fine-tuned on a mixture of the **[DynaBench](https://huggingface.co/tomg-group-umd/DynaBench)** dataset and several safety benchmarks (WildGuard, BeaverTails, ToxicChat, Aegis 2.0).
+* **Training Procedure:** The model was trained using Supervised Fine-Tuning (SFT) for one epoch, followed by GRPO.
 
 ### Key Features
 
-*
-*
-*
+* **Dynamic Policies:** Accepts arbitrary guardrail policies written in natural language, allowing for bespoke and application-specific moderation.
+* **Interpretability:** Can generate detailed, natural-language explanations for why a policy was violated, enabling chatbot recovery and human-in-the-loop refinement.
+* **Dual-Mode Inference:** Supports two modes for flexibility:
 1. **Fast Inference:** Provides a direct `PASS` or `FAIL` classification for minimal latency.
 2. **Chain-of-Thought (CoT):** Generates a reasoning trace before giving the final classification, offering interpretability.
 
@@ -109,7 +109,8 @@ Evaluate the following dialogue for compliance with the given policy. Provide th
 """
 inputs = tokenizer(fast_prompt, return_tensors="pt").to(model.device)
 outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.1)
+print("\n--- Fast Inference Mode Output ---")
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
````
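For reference, here is a self-contained version of the fast-inference snippet excerpted in the diff above. The repo id `tomg-group-umd/DynaGuard-4B` and the policy/dialogue text are illustrative assumptions; only the tokenizer/generate calls, the instruction sentence, and the output header come from the card itself.

```python
# Hedged sketch of an end-to-end fast-inference run. The repo id and the
# policy/dialogue wording below are assumptions, not taken verbatim
# from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tomg-group-umd/DynaGuard-4B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

fast_prompt = """Policy: The agent must never provide medical advice.

Dialogue:
User: My head hurts. What should I take?
Agent: Take 800 mg of ibuprofen right away.

Evaluate the following dialogue for compliance with the given policy. Provide the answer as PASS or FAIL.
"""

inputs = tokenizer(fast_prompt, return_tensors="pt").to(model.device)
# Decoding stays greedy unless do_sample=True is passed; this mirrors
# the card's snippet as shown.
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.1)
print("\n--- Fast Inference Mode Output ---")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```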