Artifacts for paper "Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements" (https://arxiv.org/abs/2410.08968)
Jack Zhang
jackzhang
Recent Activity
updated a dataset 10 days ago: jackzhang/nyt_texts_filtered_prompt_continuation
published a dataset 10 days ago: jackzhang/nyt_texts_filtered_prompt_continuation
published a model 12 days ago: jackzhang/newsspan.normalized.quip.100-1.bf
Collections: 1
models: 4
datasets: 14
jackzhang/nyt_texts_filtered_prompt_continuation • 28.4k • 48
jackzhang/CoSApien • 200 • 33
jackzhang/V5-bt-wg-addr_imp-train • 122k • 40
jackzhang/V4-bt_gpt-4o_wg-train • 133k • 45
jackzhang/bt_7cat_test_400_unseencat • 1.2k • 50
jackzhang/bt_7cat_5spec_testset_400 • 2k • 64
jackzhang/V2-given_sys-ah-train-no_em • 61.1k • 51
jackzhang/bt_multi_4-V1-given_sys_combine-test • 3.45k • 46
jackzhang/BeaverTails-dedupprompt_model-gpt-4o_harmful_cat_judge_clustercat_cot-improved • 34.2k • 55
jackzhang/BeaverTails-dedupprompt_model-gpt-4-32k_harmful_cat_clustercat_cot-improved • 34.2k • 46