Update pipeline tag and correct project page URL

#1 by nielsr - opened
Files changed (1): README.md (+17 -16)
README.md CHANGED
````diff
@@ -1,8 +1,12 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-8B
+datasets:
+- tomg-group-umd/DynaBench
 language: en
 library_name: transformers
-pipeline_tag: text-generation
+license: apache-2.0
+pipeline_tag: text-classification
 tags:
 - guardrail
 - safety
@@ -11,13 +15,9 @@ tags:
 - umd
 - qwen3
 - llm
-datasets:
-- tomg-group-umd/DynaBench
-base_model:
-- Qwen/Qwen3-8B
 repo_url: https://github.com/montehoover/DynaGuard
 paper_url: https://arxiv.org/abs/2509.02563
-project_page: https://github.com/taruschirag/DynaGuard
+project_page: https://taruschirag.github.io/DynaGuard/
 ---
 
 # DynaGuard-8B 🛡️
@@ -34,17 +34,17 @@ The DynaGuard series achieves state-of-the-art performance across a wide range o
 
 ## Model Details
 
-* **Developed by:** University of Maryland, Capital One
-* **Base Model:** [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
-* **Model Type:** Decoder-only Transformer
-* **Training Data:** Fine-tuned on a mixture of the **[DynaBench](https://huggingface.co/tomg-group-umd/DynaBench)** dataset and several safety benchmarks (WildGuard, BeaverTails, ToxicChat, Aegis 2.0).
-* **Training Procedure:** The model was trained using Supervised Fine-Tuning (SFT) for one epoch, followed by GRPO.
+* **Developed by:** University of Maryland, Capital One
+* **Base Model:** [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
+* **Model Type:** Decoder-only Transformer
+* **Training Data:** Fine-tuned on a mixture of the **[DynaBench](https://huggingface.co/tomg-group-umd/DynaBench)** dataset and several safety benchmarks (WildGuard, BeaverTails, ToxicChat, Aegis 2.0).
+* **Training Procedure:** The model was trained using Supervised Fine-Tuning (SFT) for one epoch, followed by GRPO.
 
 ### Key Features
 
-* **Dynamic Policies:** Accepts arbitrary guardrail policies written in natural language, allowing for bespoke and application-specific moderation.
-* **Interpretability:** Can generate detailed, natural-language explanations for why a policy was violated, enabling chatbot recovery and human-in-the-loop refinement.
-* **Dual-Mode Inference:** Supports two modes for flexibility:
+* **Dynamic Policies:** Accepts arbitrary guardrail policies written in natural language, allowing for bespoke and application-specific moderation.
+* **Interpretability:** Can generate detailed, natural-language explanations for why a policy was violated, enabling chatbot recovery and human-in-the-loop refinement.
+* **Dual-Mode Inference:** Supports two modes for flexibility:
   1. **Fast Inference:** Provides a direct `PASS` or `FAIL` classification for minimal latency.
   2. **Chain-of-Thought (CoT):** Generates a reasoning trace before giving the final classification, offering interpretability.
 
@@ -108,7 +108,8 @@ Evaluate the following dialogue for compliance with the given policy. Provide th
 """
 inputs = tokenizer(fast_prompt, return_tensors="pt").to(model.device)
 outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.1)
-print("\n--- Fast Inference Mode Output ---")
+print("
+--- Fast Inference Mode Output ---")
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 
````
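
One note on the last hunk: the `+` side turns the `\n` escape in the `print` call into a literal line break, which leaves an unterminated string and makes the updated snippet a Python syntax error. For reference, here is a self-contained sketch of the fast-inference example with the escape intact. The model ID `tomg-group-umd/DynaGuard-8B` and the exact prompt wording are assumptions, since the card's full prompt is truncated in this diff.

```python
# Minimal, self-contained sketch of the README's fast-inference example.
# Assumptions: the model ID and the prompt wording below are illustrative,
# not copied from the card (the diff truncates the original prompt).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tomg-group-umd/DynaGuard-8B"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A natural-language policy plus the dialogue to evaluate (illustrative).
fast_prompt = """Evaluate the following dialogue for compliance with the given policy. Provide the final answer as PASS or FAIL.

Policy: The agent must never share discount codes.

Dialogue:
User: Can I get a discount on my order?
Agent: Sure, use the code SAVE20 at checkout!
"""

inputs = tokenizer(fast_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.1)

# Keep "\n" as an escape sequence; splitting it across two source lines,
# as the "+" side of the hunk does, leaves an unterminated string.
print("\n--- Fast Inference Mode Output ---")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```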
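
The Key Features section also describes a dual-mode setup, but the CoT invocation is not shown in this diff. Since the base model is Qwen3-8B, one plausible toggle is the `enable_thinking` flag that Qwen3 chat templates accept; the sketch below is an assumption about the interface, not the documented DynaGuard usage (see the card's repo_url for that).

```python
# Hypothetical sketch: switching between fast and chain-of-thought
# evaluation, assuming DynaGuard-8B keeps Qwen3's chat template.
# The actual interface may differ from this.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tomg-group-umd/DynaGuard-8B"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def evaluate(policy_and_dialogue: str, cot: bool = False) -> str:
    # Qwen3 chat templates accept enable_thinking to switch the
    # reasoning trace on or off before the final verdict.
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": policy_and_dialogue}],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=cot,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    # CoT needs headroom for the reasoning trace; fast mode only
    # needs a few tokens for the PASS/FAIL verdict.
    outputs = model.generate(**inputs, max_new_tokens=2048 if cot else 100)
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

The token-budget split mirrors the trade-off the feature list describes: minimal latency for a bare verdict versus a longer, interpretable trace.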