DeMarcus Edwards
djmcflush
0 followers · 3 following
https://www.darelabs.xyz
AI & ML interests
Adversarial Machine learning, Robotics, Vision, NLP, Graph theory
Recent Activity
liked a dataset 14 days ago: saiyan-world/Goku-MovieGenBench
updated a model 3 months ago: djmcflush/LLNL_LLAMA
reacted to osanseviero's post about 1 year ago
Mixture of experts: beware! New paper by DeepMind: Buffer Overflow in MoE https://huggingface.co/papers/2402.05526

The paper shows an adversarial attack strategy in which a user sends malicious queries that can affect the output of other users' queries from the same batch. So if the same batch contains
- User A: benign query
- User B: malicious query
the response for A might be altered!

How is this possible? One approach is to fill the token buffers with adversarial data, forcing the gating to use non-ideal experts or to drop the benign tokens entirely (in the case of a finite buffer size). This assumes the adversary can only use the model as a black box, but can observe the logit outputs and ensure that their data is always grouped in the same batch as the victim's.

How to mitigate this?
- Randomize batch order (and even run twice if some queries are very sensitive)
- Use a large capacity slack
- Sample from the gate weights instead of taking the top-k (not great IMO, as that requires more memory for inference)

Very cool paper!
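As a quick illustration of why finite expert buffers create this cross-user dependence, here is a minimal sketch of greedy top-1 routing with a per-expert buffer. This is not the paper's implementation: the expert count, buffer capacity, and random gating weights are made-up toy values, and the "adversarial" tokens are simply duplicates of the benign user's first token, standing in for inputs crafted to target the same expert.

```python
# Toy sketch of capacity-limited top-1 MoE routing (not the paper's setup):
# each expert has a finite per-batch token buffer, and tokens whose preferred
# expert is already full are dropped.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 4   # toy values, chosen only for illustration
CAPACITY = 2      # per-expert buffer size for the whole batch
D_MODEL = 8

# Hypothetical gating weights; in a real MoE these are learned.
W_gate = rng.normal(size=(D_MODEL, NUM_EXPERTS))

def route(tokens):
    """Greedy top-1 routing with a finite per-expert buffer.

    Returns, for each token, the index of the expert it was assigned to,
    or -1 if its preferred expert's buffer was already full (token dropped).
    """
    logits = tokens @ W_gate                     # (n_tokens, NUM_EXPERTS)
    preferred = logits.argmax(axis=-1)           # top-1 expert per token
    load = np.zeros(NUM_EXPERTS, dtype=int)
    assignment = np.full(len(tokens), -1, dtype=int)
    for i, expert in enumerate(preferred):       # processed in batch order
        if load[expert] < CAPACITY:
            assignment[i] = expert
            load[expert] += 1
    return assignment

# User A: benign tokens.
benign = rng.normal(size=(2, D_MODEL))

# User B: "adversarial" filler aimed at A's preferred expert. Here we simply
# duplicate A's first token CAPACITY times as a stand-in for crafted inputs.
adversarial = np.tile(benign[0], (CAPACITY, 1))

# Case 1: A's tokens come first in the batch -> they fit in the buffers.
print("A first:", route(np.vstack([benign, adversarial])))

# Case 2: B's tokens come first -> they fill the buffer of A's preferred
# expert, so A's first token is dropped (-1) and A's output changes even
# though A's query did not.
print("B first:", route(np.vstack([adversarial, benign])))
```

Running it prints one assignment per token: when the benign tokens come first they keep their preferred expert, but when the filler tokens come first the benign user's first token is dropped (-1). That batch-order dependence is what the mitigations above (randomized batch order, capacity slack, sampling the gate) are meant to break.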
Organizations
None yet
models (2)
djmcflush/xtts-v2 · Updated Nov 26, 2024
djmcflush/LLNL_LLAMA · Updated Nov 12, 2024
datasets
None public yet