I've tested many fine-tunes. They all scored lower than the base model on AHA.
Yesterday I found one fine-tune (an abliteration) that took the model's score from 28 to 46: huihui-ai/Huihui-gpt-oss-120b-BF16-abliterated
Is there a correlation between censorship and not being human-aligned?