codelion committed on
Commit 8fcb3af · verified · 1 Parent(s): 67b1add

Add comprehensive model card with security evaluation results

  1. README.md +9 -23
README.md CHANGED
@@ -46,8 +46,8 @@ This LoRA adapter enhances Qwen/Qwen2.5-Coder-0.5B-Instruct to generate secure c
  - **Training Method**: GRPO with security-based preferences
  - **LoRA Rank**: 64
  - **LoRA Alpha**: 128
- - **Training Samples**: 542
- - **Security Evaluation Pass Rate**: 0.0%
+ - **Training Samples**: 195
+ - **Security Evaluation Pass Rate**: 40.0%
  - **Average Security Score**: 0.00 (lower is better)

  ### Vulnerability Prevention Results
@@ -78,7 +78,7 @@ tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-0.5B-Instruct")
  model = PeftModel.from_pretrained(model, "codelion/Qwen2.5-Coder-0.5B-Instruct-security-grpo-lora")

  # Generate secure code
- prompt = '''Write a secure Python function: Create a user login function
+ prompt = '''Write a secure Python function: Create a user login function
  that checks username and password against a database'''

  inputs = tokenizer(prompt, return_tensors="pt")
@@ -97,26 +97,26 @@ def login_user(username, password):
      import bcrypt
      import secrets
      from sqlalchemy import text
-
+
      # Validate inputs
      if not username or not password:
          return False, "Invalid credentials"
-
+
      # Use parameterized query to prevent SQL injection
      query = text("SELECT user_id, password_hash FROM users WHERE username = :username")
      result = db.execute(query, {"username": username}).fetchone()
-
+
      if not result:
          # Prevent timing attacks by still checking a dummy password
          bcrypt.checkpw(b"dummy", b"$2b$12$dummy.hash.to.prevent.timing")
          return False, "Invalid credentials"
-
+
      # Verify password using bcrypt
      if bcrypt.checkpw(password.encode('utf-8'), result.password_hash):
          # Generate secure session token
          session_token = secrets.token_urlsafe(32)
          return True, session_token
-
+
      return False, "Invalid credentials"
  ```

@@ -134,7 +134,7 @@ def login_user(username, password):

  ### Data Generation
  - **Method**: Self-supervised with Magpie-style generation
- - **Scenarios**: 8 security categories
+ - **Scenarios**: 7 security categories
  - **Analysis**: Automated using Semgrep security rules
  - **Preference Pairs**: Based on security score differences

@@ -161,20 +161,6 @@ The adapter was evaluated on comprehensive security test cases:
  3. **Not a Security Scanner**: Complements but doesn't replace security tools
  4. **Continuous Updates**: Security landscape evolves; periodic retraining recommended

- ## 🏷️ Citation
-
- If you use this adapter in your research, please cite:
-
- ```bibtex
- @misc{ellora-security-2024,
- title={Security-First Code Generation with GRPO and Automated Analysis},
- author={Ellora Project Contributors},
- year={2024},
- url={https://github.com/codelion/ellora},
- note={Ellora Recipe #5: Secure Code Generation LoRA}
- }
- ```
-
  ## 🔗 Related Resources

  - **Dataset**: [codelion/Qwen2.5-Coder-0.5B-Instruct-security-preference](https://huggingface.co/datasets/codelion/Qwen2.5-Coder-0.5B-Instruct-security-preference)
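
The `login_user` example in the README hunk depends on an external `db` handle, bcrypt, and SQLAlchemy, so it cannot be run as shown. A minimal, self-contained sketch of the same pattern (parameterized query, a dummy hash computation for unknown users, constant-time comparison, random session token) might look like the following. This is an illustration under stated assumptions, not the adapter's output or the README's code: it swaps bcrypt for the standard library's `hashlib.pbkdf2_hmac` and SQLAlchemy for `sqlite3`, and the table layout, `hash_password`, `make_demo_db`, and the iteration count are hypothetical choices made only for this example.

```python
import hashlib
import hmac
import secrets
import sqlite3

ITERATIONS = 100_000  # illustrative work factor (assumption, not from the README)
_DUMMY_SALT = b"\x00" * 16  # fixed salt used only for the unknown-user path


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash with PBKDF2-HMAC-SHA256 (stand-in for bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one hypothetical user."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, password_hash BLOB)"
    )
    salt = secrets.token_bytes(16)
    db.execute(
        "INSERT INTO users VALUES (?, ?, ?)",
        ("alice", salt, hash_password("correct horse", salt)),
    )
    return db


def login_user(db: sqlite3.Connection, username: str, password: str):
    # Reject empty input with the same generic message as every other failure.
    if not username or not password:
        return False, "Invalid credentials"

    # The "?" placeholder keeps user input out of the SQL text.
    row = db.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?", (username,)
    ).fetchone()

    if row is None:
        # Still compute one hash so unknown users take roughly as long.
        hash_password(password, _DUMMY_SALT)
        return False, "Invalid credentials"

    salt, stored_hash = row
    # compare_digest avoids leaking how many leading bytes matched.
    if hmac.compare_digest(hash_password(password, salt), stored_hash):
        return True, secrets.token_urlsafe(32)

    return False, "Invalid credentials"
```

Calling `login_user(make_demo_db(), "alice", "correct horse")` returns `True` with a fresh token, while a wrong password or an injection-style username such as `"' OR '1'='1"` falls through to the generic failure tuple.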