Damar Jati
DamarJati
AI & ML interests
Indonesian - Multimodal, CompVis, NLP
Discord: @damarjati_
Recent Activity
updated a Space 2 days ago: tensor-diffusion/README
updated a model 2 days ago: tensor-diffusion/sd-embeddings
published a model 2 days ago: tensor-diffusion/real-not-real
DamarJati's activity

reacted to clem's post with 🔥 2 days ago

reacted to ZennyKenny's post with 🔥 11 days ago
Post
I've completed the first unit of the just-launched Hugging Face Agents Course. I would highly recommend it, even for experienced builders, because it is a great walkthrough of the smolagents library and toolkit.

reacted to tomaarsen's post with 🔥 12 days ago
Post
📣 Sentence Transformers v3.2.0 is out, marking the biggest release for inference in 2 years! 2 new backends for embedding models: ONNX (+ optimization & quantization) and OpenVINO, allowing for speedups up to 2x-3x, AND Static Embeddings for 500x speedups at a 10-20% accuracy cost.
1️⃣ ONNX Backend: This backend uses ONNX Runtime to accelerate model inference on both CPU and GPU, reaching up to 1.4x-3x speedups depending on the precision. We also introduce 2 helper methods for optimizing and quantizing models for (much) faster inference.
2️⃣ OpenVINO Backend: This backend uses Intel's OpenVINO instead, outperforming ONNX in some situations on CPU.
Usage is as simple as SentenceTransformer("all-MiniLM-L6-v2", backend="onnx"). Does your model not have an ONNX or OpenVINO file yet? No worries, it'll be auto-exported for you. Thank me later.
Another major new feature is Static Embeddings: think word embeddings like GloVe and word2vec, but modernized. Static Embeddings are bags of token embeddings that are summed together to create text embeddings, allowing for lightning-fast embeddings that don't require any neural networks. They're initialized in one of 2 ways:
1️⃣ Via Model2Vec, a new technique for distilling any Sentence Transformer model into static embeddings. Either via a pre-distilled model with from_model2vec, or with from_distillation, where you do the distillation yourself. It'll only take 5 seconds on GPU & 2 minutes on CPU, no dataset needed.
2️⃣ Random initialization. This requires finetuning, but finetuning is extremely quick (e.g. I trained with 3 million pairs in 7 minutes). My final model was 6.6% worse than bge-base-en-v1.5, but 500x faster on CPU.
Full release notes: https://github.com/UKPLab/sentence-transformers/releases/tag/v3.2.0
Documentation on Speeding up Inference: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
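To make this concrete, here is a minimal sketch of both features, assuming sentence-transformers >= 3.2.0 (plus the model2vec package for the distillation route); the pre-distilled checkpoint id is only an illustrative example:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import StaticEmbedding

# ONNX backend: if the repo has no ONNX file yet, it is auto-exported on first load.
onnx_model = SentenceTransformer("all-MiniLM-L6-v2", backend="onnx")
print(onnx_model.encode(["ONNX-accelerated inference"]).shape)

# Static Embeddings from a pre-distilled Model2Vec checkpoint (example model id).
static = StaticEmbedding.from_model2vec("minishlab/M2V_base_output")
static_model = SentenceTransformer(modules=[static])
print(static_model.encode(["No neural network needed at inference time"]).shape)

# Or distill any Sentence Transformer yourself (a few seconds on GPU, no dataset needed):
# static = StaticEmbedding.from_distillation("BAAI/bge-base-en-v1.5", device="cuda")
```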

reacted to m-ric's post with 🔥 12 days ago
Post
Today we make the biggest release in smolagents so far: we enable vision models, which lets you build powerful web-browsing agents! 🥳
Our agents can now casually open up a web browser, and navigate on it by scrolling, clicking elements on the webpage, going back, just like a user would.
The demo below shows Claude-3.5-Sonnet browsing GitHub for the task: "Find how many commits the author of the current top trending repo did over last year."
Hi @mlabonne!
Go try it out, it's the most cracked agentic stuff I've seen in a while 🤯 (well, along with OpenAI's Operator who beat us by one day)
For more detail, read our announcement blog: https://huggingface.co/blog/smolagents-can-see
The code for the web browser example is here: https://github.com/huggingface/smolagents/blob/main/examples/vlm_web_browser.py
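For orientation, a condensed sketch of the agent setup only; the full demo additionally wires in helium/selenium browser tools and a screenshot step callback (see the vlm_web_browser.py example linked above), and the model id here is an assumption:
```python
from smolagents import CodeAgent, LiteLLMModel

# Vision-capable model; the demo uses Claude-3.5-Sonnet (model id is an assumption).
model = LiteLLMModel(model_id="anthropic/claude-3-5-sonnet-latest")

# The real example also passes helium-based browser tools and a callback that
# attaches a screenshot to each step so the vision model can "see" the page.
agent = CodeAgent(
    tools=[],
    model=model,
    additional_authorized_imports=["helium"],
    max_steps=20,
)

agent.run(
    "Find how many commits the author of the current top trending repo did over last year."
)
```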

posted an update about 2 months ago
Post
Happy New Year 2025 🤗
For the Hugging Face community.

reacted to victor's post with 🔥 6 months ago
Post
Calling all Hugging Face users! We want to hear from YOU!
What feature or improvement would make the biggest impact on Hugging Face?
Whether it's the Hub, better documentation, new integrations, or something completely different โ we're all ears!
Your feedback shapes the future of Hugging Face. Drop your ideas in the comments below!

posted an update 6 months ago
Post
Improved ControlNet!
Now supports dynamic resolution for perfect landscape and portrait outputs. Generate stunning images without distortion, optimized for any aspect ratio!
...
DamarJati/FLUX.1-DEV-Canny
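For readers who want to wire this ControlNet into diffusers, a rough sketch under a few assumptions: a diffusers version with Flux ControlNet support, access to the gated black-forest-labs/FLUX.1-dev base model, and a pre-computed Canny edge image; parameter values are placeholders to tune:
```python
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline
from diffusers.utils import load_image

controlnet = FluxControlNetModel.from_pretrained(
    "DamarJati/FLUX.1-DEV-Canny", torch_dtype=torch.bfloat16
)
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, torch_dtype=torch.bfloat16
).to("cuda")

# Canny edge map of the desired composition; landscape or portrait both work,
# since the output resolution here follows the control image.
control_image = load_image("canny_edges.png")  # placeholder path

image = pipe(
    prompt="a cozy cabin in a snowy forest at golden hour",
    control_image=control_image,
    height=control_image.height,
    width=control_image.width,
    controlnet_conditioning_scale=0.6,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("output.png")
```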

reacted to KingNish's post with 🔥 10 months ago
Post
Introducing JARVIS, Tony's voice assistant, for you.
JARVIS responds to all your questions in audio format.
Must TRY -> KingNish/JARVIS
Jarvis is currently equipped to accept text input and provide audio output.
In the future, it may also support audio input.
DEMO Video:

reacted to dhuynh95's post with 🤯 11 months ago
Post
Hello World! This post is written by the Large Action Model framework LaVague! Find out more on https://github.com/mithril-security/LaVague
Edit: Here is the video of LaVague posting this. This is quite meta.

reacted to macadeliccc's post about 1 year ago
Post
Benefits of imatrix quantization in place of QuIP#
QuIP# is a quantization method proposed by [Cornell-RelaxML](https://github.com/Cornell-RelaxML) that claims tremendous performance gains using only 2-bit precision.
RelaxML reports that by quantizing a model from 16-bit to 2-bit precision, Llama-2-70B can run on a single 24GB GPU.
QuIP# aims to revolutionize model quantization through a blend of incoherence processing and advanced lattice codebooks. By switching to a Hadamard transform-based incoherence approach, QuIP# enhances GPU efficiency, making weight matrices more Gaussian-like and ideal for quantization with its improved lattice codebooks.
This new method has already seen some adoption by projects like llama.cpp, where the QuIP# methodology has been implemented in the form of imatrix (importance matrix) calculations. The importance matrix is calculated from a dataset such as wiki.train.raw, and the tool also reports the perplexity on that dataset.
This interim step can improve the results of the quantized model. If you would like to explore this process for yourself:
llama.cpp - https://github.com/ggerganov/llama.cpp/
Quip# paper - https://cornell-relaxml.github.io/quip-sharp/
AutoQuip# colab - https://colab.research.google.com/drive/1rPDvcticCekw8VPNjDbh_UcivVBzgwEW?usp=sharing
Other impressive quantization projects to watch:
+ AQLM
https://github.com/Vahe1994/AQLM
https://arxiv.org/abs/2401.06118
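As a rough sketch of that imatrix workflow driven from Python, with the caveat that the llama.cpp binary names, flags, and file paths below are assumptions that change between versions (check the repo's docs first):
```python
import subprocess

MODEL_F16 = "models/llama-2-70b-f16.gguf"   # placeholder paths
CALIB_DATA = "wiki.train.raw"
IMATRIX_OUT = "imatrix.dat"
MODEL_Q2 = "models/llama-2-70b-q2_k.gguf"

# 1. Compute the importance matrix (and perplexity) over the calibration text.
subprocess.run(["./imatrix", "-m", MODEL_F16, "-f", CALIB_DATA, "-o", IMATRIX_OUT], check=True)

# 2. Quantize to 2-bit precision, guided by the importance matrix.
subprocess.run(["./quantize", "--imatrix", IMATRIX_OUT, MODEL_F16, MODEL_Q2, "Q2_K"], check=True)
```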

reacted to alielfilali01's post about 1 year ago

reacted to Xenova's post with 🤯❤️ about 1 year ago
Post
Introducing Remove Background Web: In-browser background removal, powered by @briaai's new RMBG-v1.4 model and 🤗 Transformers.js!
Everything runs 100% locally, meaning none of your images are uploaded to a server! 🤯 At only ~45MB, the 8-bit quantized version of the model is perfect for in-browser usage (it even works on mobile).
Check it out!
Demo: Xenova/remove-background-web
Model: briaai/RMBG-1.4
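The post is about the in-browser Transformers.js demo, but the same model can also be tried from Python; a small sketch, assuming the model card's custom pipeline code (hence trust_remote_code=True) and a local input.jpg:
```python
from transformers import pipeline

# RMBG-1.4 ships custom pipeline code, so remote code has to be trusted explicitly.
rmbg = pipeline("image-segmentation", model="briaai/RMBG-1.4", trust_remote_code=True)

# Accepts a path, URL, or PIL image and returns a PIL image with the background removed.
result = rmbg("input.jpg")
result.save("no_background.png")
```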

reacted to fffiloni's post with ❤️ about 1 year ago
Post
Quick build of the day: LCM Supa Fast Image Variation
We take the opportunity to combine moondream1 vision and LCM SDXL fast abilities to generate a variation from the subject of the image input.
All that thanks to gradio APIs 🤗
Try the space: https://huggingface.co/spaces/fffiloni/lcm-img-variations
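Because the Space is a Gradio app, it can also be called programmatically; a sketch with gradio_client, where the endpoint name is a guess, so check the Space's "Use via API" panel (or client.view_api()) for the real one:
```python
from gradio_client import Client, handle_file

client = Client("fffiloni/lcm-img-variations")

# "/infer" is a hypothetical endpoint name; list the real ones with client.view_api().
result = client.predict(handle_file("input.jpg"), api_name="/infer")
print(result)
```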

reacted to joaogante's post with ❤️ about 1 year ago
Post
Up to 3x faster LLM generation with no extra resources/requirements - ngram speculation has landed in 🤗 transformers! 🏎️💨
All you need to do is to add prompt_lookup_num_tokens=10 to your generate call, and you'll get faster LLMs 🔥
How does it work? 🤔
Start with assisted generation, where a smaller model generates candidate sequences. The net result is a significant speedup if the model agrees with the candidate sequences! However, we do require a smaller model trained similarly.
The idea introduced (and implemented) by Apoorv Saxena consists of gathering the candidate sequences from the input text itself. If the latest generated ngram is in the input, use the continuation therein as a candidate! No smaller model is required, while still achieving significant speedups 🔥
In fact, the penalty of gathering and testing the candidates is so small that you should use this technique whenever possible!
Here is the code example that produces the outputs shown in the video: https://pastebin.com/bms6XtR4
Have fun 🤗
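A minimal sketch of what that looks like, assuming a transformers release with prompt lookup decoding (it landed around v4.37) and any causal LM; the tiny gpt2 checkpoint is only an example, and the technique pays off most on input-grounded tasks like summarization:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # any causal LM works; gpt2 is just a small example
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = (
    "Summarize: The Hugging Face Hub hosts models, datasets and demos.\n"
    "Summary: The Hugging Face Hub hosts"
)
inputs = tokenizer(prompt, return_tensors="pt")

# Ngram speculation: candidate continuations are looked up in the prompt itself,
# so no draft model is needed.
outputs = model.generate(**inputs, max_new_tokens=40, prompt_lookup_num_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```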

reacted to alvarobartt's post with 🤯 about 1 year ago
Post
🚨 Notux 8x7b was just released!
From Argilla, we recently fine-tuned Mixtral 8x7b Instruct from Mistral AI using DPO and a binarized, curated version of UltraFeedback, only to find out it outperforms every other MoE-based model on the Hub.
- argilla/notux-8x7b-v1
- argilla/ultrafeedback-binarized-preferences-cleaned