⭐️ The AI Energy Score project just launched - this is a game-changer for making informed decisions about AI deployment.
You can now see exactly how much energy your chosen model will consume, with a simple 5-star rating system. Think appliance energy labels, but for AI.
Looking at transcription models on the leaderboard is fascinating: choosing between whisper-tiny and whisper-large-v3 can make a 7x difference in energy consumption. Having hard data on these tradeoffs changes everything.
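To make that tradeoff concrete, here's a back-of-the-envelope sketch. The Wh-per-query figures are hypothetical placeholders (the real numbers live on the leaderboard); only the ~7x ratio is from the leaderboard.

```python
# Back-of-the-envelope energy comparison.
# The Wh-per-query figures are hypothetical placeholders;
# only the ~7x ratio between the models comes from the leaderboard.
QUERIES_PER_DAY = 100_000

wh_per_query = {
    "whisper-tiny": 0.1,      # placeholder value
    "whisper-large-v3": 0.7,  # placeholder: ~7x whisper-tiny
}

for model, wh in wh_per_query.items():
    daily_kwh = wh * QUERIES_PER_DAY / 1000
    print(f"{model}: {daily_kwh:.0f} kWh/day")
```

At that volume, the same workload costs 10 kWh/day on one model and 70 kWh/day on the other: exactly the kind of difference the leaderboard makes visible.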
166 models already evaluated across 10 different tasks, from text generation to image classification. The whole thing is public and you can submit your own models to test.
Why this matters:
- Teams can pick efficient models that still get the job done
- Developers can optimize for energy use from day one
- Organizations can finally predict their AI environmental impact
If you're building with AI at any scale, definitely worth checking out.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in the 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5x the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interestingly, the community focused on quantized versions for better efficiency and accessibility: they want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. 🚀
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version โ 1M downloads alone.
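If you want to try that checkpoint locally, here's a minimal sketch with llama-cpp-python. The quant filename is an assumption (check the repo for the variants actually published), and you'll need enough RAM for a 32B model even at 4-bit.

```python
# Minimal sketch: run the community GGUF quant locally with llama-cpp-python.
# The quant filename is an assumption; check the repo for published variants.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF",
    filename="*Q4_K_M.gguf",  # 4-bit quant: less memory, faster inference
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why do people quantize models?"}]
)
print(out["choices"][0]["message"]["content"])
```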
🤖 Adobe's code-generating agent reaches the top of the GAIA leaderboard - and their paper cites my work!
💡 Reminder: in short, agentic systems are a vehicle you put your LLM in, giving it access to the outside world.
➡️ The researchers at Adobe started from the idea that current agentic systems lack the ability to define their own tools. So they made an agent that writes its actions as code, allowing it to write Python functions that can be re-used later as tools!
Here's what the LLM generations can look like with the proper prompt:
```
Thought: I need to access the excel file using a different method.
Action:
def access_excel_file(file_path):
    ...  # rest of the code (the agent does write it, but I don't have room in this post)
    return rows
```
Then your system executes this and appends the observation to the agent's memory.
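Here's a minimal sketch of that execute-and-append step. The `memory` list and the bare `exec` are my simplifications, not the paper's actual harness; never exec untrusted model output without sandboxing.

```python
# Minimal sketch of the execute-and-observe step; not the paper's harness.
import contextlib
import io
import traceback

memory = []  # running transcript of actions and observations
tools = {}   # namespace where agent-defined functions persist as reusable tools

def execute_action(code: str) -> str:
    """Run agent-generated code, capture stdout, keep new defs in `tools`."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, tools)  # functions defined here survive for later steps
        return buffer.getvalue() or "Code ran without output."
    except Exception:
        return traceback.format_exc()  # errors become observations too

action = 'def greet(name):\n    return f"Hello {name}"\nprint(greet("world"))'
memory.append({"action": action, "observation": execute_action(action)})
print(memory[-1]["observation"])  # -> Hello world
```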
Why is this code formulation better than the classical JSON tool-call formulation? The paper explains:
"Most existing work uses text or JSON as the representation of actions, which significantly lacks the two criteria mentioned earlier: generality and composability. In contrast, DynaSaur can utilize available actions or create new ones if necessary, using code as a unified representation. In principle, acting with code enables agents to solve any Turing-complete problem."
The idea of using code is not new: in fact, we already do it in transformers.agents (hence the citation I got). Their implementation adds further refinements, like using RAG to retrieve relevant functions before generating an action, which increases performance further.
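That retrieval step could look roughly like this sketch over the tool docstrings; the embedding model and top_k are my assumptions, not the paper's exact setup.

```python
# Sketch of the RAG step: retrieve relevant tools before generating an action.
# Embedding model and top_k are assumptions, not the paper's exact setup.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

tool_docs = {
    "access_excel_file": "Open an excel file and return its rows.",
    "plot_histogram": "Plot a histogram from a list of numbers.",
    "fetch_url": "Download the text content of a web page.",
}

def retrieve_tools(task: str, top_k: int = 2) -> list[str]:
    """Return the names of the top_k tools whose docs best match the task."""
    names = list(tool_docs)
    doc_emb = embedder.encode([tool_docs[n] for n in names], convert_to_tensor=True)
    task_emb = embedder.encode(task, convert_to_tensor=True)
    scores = util.cos_sim(task_emb, doc_emb)[0]
    top = scores.argsort(descending=True)[:top_k]
    return [names[int(i)] for i in top]

# Only the retrieved functions get injected into the agent's prompt.
print(retrieve_tools("Read the quarterly numbers from report.xlsx"))
```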
And they observe that code agents perform much better, reaching the top of the GAIA leaderboard! 🔥
Go take a look, it's really clear and informative!