⭐️ The AI Energy Score project just launched - this is a game-changer for making informed decisions about AI deployment.
You can now see exactly how much energy your chosen model will consume, with a simple 5-star rating system. Think appliance energy labels, but for AI.
Looking at transcription models on the leaderboard is fascinating: choosing between whisper-tiny and whisper-large-v3 can make a 7x difference in energy consumption. Having hard data on these tradeoffs changes everything.
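To make that tradeoff concrete, here's a back-of-the-envelope sketch. The Wh-per-query figures are hypothetical placeholders (the real numbers live on the leaderboard); only the ~7x ratio is from the leaderboard.

```python
# Back-of-the-envelope energy comparison.
# The Wh-per-query figures are hypothetical placeholders;
# only the ~7x ratio between the models comes from the leaderboard.
QUERIES_PER_DAY = 100_000

wh_per_query = {
    "whisper-tiny": 0.1,      # placeholder value
    "whisper-large-v3": 0.7,  # placeholder: ~7x whisper-tiny
}

for model, wh in wh_per_query.items():
    daily_kwh = wh * QUERIES_PER_DAY / 1000
    print(f"{model}: {daily_kwh:.0f} kWh/day")
```

At that volume, the same workload costs 10 kWh/day on one model and 70 kWh/day on the other: exactly the kind of difference the leaderboard makes visible.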
166 models already evaluated across 10 different tasks, from text generation to image classification. The whole thing is public and you can submit your own models to test.
Why this matters:
- Teams can pick efficient models that still get the job done
- Developers can optimize for energy use from day one
- Organizations can finally predict their AI environmental impact
If you're building with AI at any scale, definitely worth checking out.
Yes, DeepSeek R1's release is impressive. But the real story is what happened in the 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into 550+ NEW models on Hugging Face. Total downloads? 2.5M, nearly 5x the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interestingly, the community focused on quantized versions for better efficiency and accessibility: they want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. 🚀
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version โ 1M downloads alone.
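If you want to try that checkpoint locally, here's a minimal sketch with llama-cpp-python. The quant filename is an assumption (check the repo for the variants actually published), and you'll need enough RAM for a 32B model even at 4-bit.

```python
# Minimal sketch: run the community GGUF quant locally with llama-cpp-python.
# The quant filename is an assumption; check the repo for published variants.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF",
    filename="*Q4_K_M.gguf",  # 4-bit quant: less memory, faster inference
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Why do people quantize models?"}]
)
print(out["choices"][0]["message"]["content"])
```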
🤖 Adobe's code-generating agent reaches the top of the GAIA leaderboard - and their paper cites my work!
💡 Reminder: in short, agentic systems are a vehicle you put your LLM in, giving it access to the outside world.
➡️ The researchers at Adobe started from the idea that current agentic systems lack the ability to define their own tools. So they made an agent that writes its actions as code, allowing it to write Python functions that can be re-used later as tools!
Here's what the LLM generations can look like with the proper prompt:
```
Thought: I need to access the excel file using a different method.
Action:
def access_excel_file(file_path):
    ...  # rest of the code (the agent does write it, but I don't have room in this post)
    return rows
```
Then your system executes this and appends the observation to the agent's memory.
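Here's a minimal sketch of that execute-and-append step. The `memory` list and the bare `exec` are my simplifications, not the paper's actual harness; never exec untrusted model output without sandboxing.

```python
# Minimal sketch of the execute-and-observe step; not the paper's harness.
import contextlib
import io
import traceback

memory = []  # running transcript of actions and observations
tools = {}   # namespace where agent-defined functions persist as reusable tools

def execute_action(code: str) -> str:
    """Run agent-generated code, capture stdout, keep new defs in `tools`."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, tools)  # functions defined here survive for later steps
        return buffer.getvalue() or "Code ran without output."
    except Exception:
        return traceback.format_exc()  # errors become observations too

action = 'def greet(name):\n    return f"Hello {name}"\nprint(greet("world"))'
memory.append({"action": action, "observation": execute_action(action)})
print(memory[-1]["observation"])  # -> Hello world
```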
Why is this code formulation better than the classical JSON tool-call formulation? The paper explains:
"Most existing work uses text or JSON as the representation of actions, which significantly lacks the two criteria mentioned earlier: generality and composability. In contrast, DynaSaur can utilize available actions or create new ones if necessary, using code as a unified representation. In principle, acting with code enables agents to solve any Turing-complete problem."
The idea of using code is not new: in fact, we already do it in transformers.agents (hence the citation I got). Their implementation adds further refinements, like using RAG to retrieve relevant functions before generating an action, which increases performance further.
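That retrieval step could look roughly like this sketch over the tool docstrings; the embedding model and top_k are my assumptions, not the paper's exact setup.

```python
# Sketch of the RAG step: retrieve relevant tools before generating an action.
# Embedding model and top_k are assumptions, not the paper's exact setup.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

tool_docs = {
    "access_excel_file": "Open an excel file and return its rows.",
    "plot_histogram": "Plot a histogram from a list of numbers.",
    "fetch_url": "Download the text content of a web page.",
}

def retrieve_tools(task: str, top_k: int = 2) -> list[str]:
    """Return the names of the top_k tools whose docs best match the task."""
    names = list(tool_docs)
    doc_emb = embedder.encode([tool_docs[n] for n in names], convert_to_tensor=True)
    task_emb = embedder.encode(task, convert_to_tensor=True)
    scores = util.cos_sim(task_emb, doc_emb)[0]
    top = scores.argsort(descending=True)[:top_k]
    return [names[int(i)] for i in top]

# Only the retrieved functions get injected into the agent's prompt.
print(retrieve_tools("Read the quarterly numbers from report.xlsx"))
```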
And they observe that code agents perform much better, reaching the top of the GAIA leaderboard! 🔥
Go take a look, it's really clear and informative!