Michal Zebrowski

M1cler

AI & ML interests

None yet


Organizations

None yet

M1cler's activity

reacted to as-cle-bert's post with 🤯 9 days ago
๐’๐œ๐ข๐๐ž๐ฐ๐ฌ๐๐จ๐ญ - ๐‘๐ž๐ฉ๐จ๐ซ๐ญ ๐๐š๐ข๐ฅ๐ฒ ๐’๐œ๐ข๐ž๐ง๐œ๐ž ๐ง๐ž๐ฐ๐ฌ ๐จ๐ง ๐๐ฅ๐ฎ๐ž๐’๐ค๐ฒ

GitHub 👉 https://github.com/AstraBert/SciNewsBot
BlueSky 👉 https://bsky.app/profile/sci-news-bot.bsky.social

Hi there HF Community!🤗
I just created a very simple AI-powered bot that shares fact-checked news about Science, Environment, Energy and Technology on BlueSky :)

The bot takes news from Google News, filters out the sources that are not represented in the Media Bias Fact Check database, and then evaluates the reliability of the source based on the MBFC metrics. After that, it creates a catchy headline for the article and publishes the post on BlueSky📰
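
To make the flow concrete, here is a minimal Python sketch of such a pipeline. It is not SciNewsBot's actual code (that lives in the GitHub repo above): the `feedparser`, `mistralai`, and `atproto` calls are real package APIs, but the toy MBFC lookup table and the overall wiring are illustrative assumptions.

```python
import os
from urllib.parse import urlparse

import feedparser              # Google News exposes RSS feeds
from atproto import Client     # BlueSky client
from mistralai import Mistral  # Mistral API client

# Toy excerpt standing in for the full Media Bias Fact Check database
MBFC = {"nature.com": "high", "reuters.com": "high"}

llm = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
bsky = Client()
bsky.login(os.environ["BSKY_HANDLE"], os.environ["BSKY_PASSWORD"])

def make_headline(title: str) -> str:
    # Ask Mistral Small for a catchy one-liner
    resp = llm.chat.complete(
        model="mistral-small-latest",
        messages=[{"role": "user",
                   "content": f"Write a catchy one-line headline for: {title}"}],
    )
    return resp.choices[0].message.content.strip()

for topic in ("science", "environment", "energy", "technology"):
    feed = feedparser.parse(f"https://news.google.com/rss/search?q={topic}")
    for entry in feed.entries:
        # Google News RSS entries carry the outlet's URL in <source>
        source_url = entry.get("source", {}).get("href", "")
        domain = urlparse(source_url).netloc.removeprefix("www.")
        if MBFC.get(domain) != "high":
            continue  # drop sources missing from MBFC or rated unreliable
        bsky.send_post(text=f"{make_headline(entry.title)}\n{entry.link}")
```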

The cool thing? SciNewsBot is open-source and cheap to maintain, as it is based on mistralai/Mistral-Small-24B-Instruct-2501 (via the Mistral API). You can reproduce it locally by spinning it up on your machine, or even launch it in the cloud through a comfy Docker setup🐋

Have fun and spread Science!✨
reacted to m-ric's post with ❤️ 9 days ago
๐—š๐—ฟ๐—ฒ๐—ฎ๐˜ ๐—ณ๐—ฒ๐—ฎ๐˜๐˜‚๐—ฟ๐—ฒ ๐—ฎ๐—น๐—ฒ๐—ฟ๐˜: you can now share agents to the Hub! ๐Ÿฅณ๐Ÿฅณ

And any agent pushed to the Hub gets a cool Space interface to chat with it directly.

This was a real technical challenge: for instance, serializing tools to export them meant getting all the source code for a tool, verifying that it was standalone (not relying on external variables), and gathering all the packages required to make it run.

Go try it out! 👉 https://github.com/huggingface/smolagents
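
For context, sharing an agent is meant to be a one-liner. A minimal sketch, assuming the `push_to_hub`/`from_hub` methods announced here and a placeholder repo id (check the smolagents docs for the exact signatures):

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A tiny web-search agent; DuckDuckGoSearchTool is a built-in tool
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())

# Sharing serializes the tools' source code and required packages
# (the challenge described above) into a Hub repo
agent.push_to_hub("your-username/my-web-agent")  # placeholder repo id

# Loading it back executes code from the repo, hence the flag
agent = CodeAgent.from_hub("your-username/my-web-agent", trust_remote_code=True)
agent.run("What's new in smolagents?")
```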
reacted to AdinaY's post with ➕ 9 days ago
Ovis2 🔥 a multimodal LLM released by the Alibaba AIDC team.
AIDC-AI/ovis2-67ab36c7e497429034874464
✨1B/2B/4B/8B/16B/34B
✨Strong CoT for deeper problem solving
✨Multilingual OCR – expanded beyond English & Chinese, with better data extraction
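
For reference, the checkpoints ship custom modeling code, so loading goes through `trust_remote_code`. A minimal sketch (the 8B repo id is assumed from the collection's naming; see the model card for the full image-chat example):

```python
import torch
from transformers import AutoModelForCausalLM

# Ovis2 ships its own modeling code, so trust_remote_code is required
model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Ovis2-8B",        # assumed repo id from the collection
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).cuda()
```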
reacted to etemiz's post with 🔥🤗🤯🚀👀🧠😔 9 days ago
Some things are simple
replied to mrzjy's post 9 days ago
replied to clem's post 4 months ago
reacted to clem's post with ❤️ 4 months ago
This is no Woodstock AI but will be fun nonetheless haha. I'll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.

1,000 spots available, first-come first-served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
reacted to m-ric's post with ❤️ 6 months ago
AI21 iterates with new Jamba 1.5 release: New standard for long-context use-cases!🏅

@ai21labs used a different architecture to beat the status-quo Transformer models: the Jamba architecture combines classic Transformer layers with new Mamba layers, whose complexity is linear (instead of quadratic) in the context length.

What does this imply?

โžก๏ธ Jamba models are much more efficient for long contexts: faster (up to 2.5x faster for long context), takes less memory, and also performs better to recall everything in the prompt.

That means it's a new go-to model for RAG or agentic applications!

And the performance is not too shabby: Jamba 1.5 models are comparable in perf to similar-sized Llama-3.1 models! The largest model even outperforms Llama-3.1 405B on Arena-Hard.

โœŒ๏ธ Comes in 2 sizes: Mini (12B active/52B) and Large (94B active/399B)
๐Ÿ“ Both deliver 256k context length, for low memory: Jamba-1.5 mini fits 140k context length on one single A100.
โš™๏ธ New quanttization method: Experts Int8 quantizes only the weights parts of the MoE layers, which account for 85% of weights
🤖 Natively supports JSON format generation & function calling.
🔓 Permissive license *if your org makes <$50M revenue*

Available on the Hub 👉 ai21labs/jamba-15-66c44befa474a917fcf55251
Read their release blog post 👉 https://www.ai21.com/blog/announcing-jamba-model-family
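
If you want to kick the tires, here's a minimal serving sketch with vLLM, which implements the ExpertsInt8 scheme. The model id matches the Hub release; the `experts_int8` flag and the context value are assumptions based on vLLM's quantization options and the A100 figure above, so double-check the model card.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="ai21labs/AI21-Jamba-1.5-Mini",  # Mini: 12B active / 52B total
    max_model_len=140_000,                 # per the post, fits on a single A100
    quantization="experts_int8",           # int8-quantize only the MoE expert weights
)
outputs = llm.generate(
    ["Summarize the Jamba 1.5 release in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```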