Nicolay Rusnachenko's picture

Nicolay Rusnachenko

nicolay-r

AI & ML interests

Information Retrieval・Medical Multimodal NLP (🖼+📝) Research Fellow @BU_Research・software developer http://arekit.io・PhD in NLP

Recent Activity

View all activity

Organizations

None yet

nicolay-r's activity

posted an update about 17 hours ago
view post
Post
578
📢 If you're looking for translating massive dataset of JSON-lines / CSV data with various set of source fields, then the following update would be relevant. So far and experimenting with adapting language specific Sentiment Analysis model, got a change to reforge and relaese bulk-translate 0.25.2.
⭐️ https://github.com/nicolay-r/bulk-translate/releases/tag/0.25.2

The update has the following major features
- Supporting schemas: all the columns to be translated are now could be declared within the same prompt-style format. using json this automatically allows to map them onto output fields
- The related updates for shell execution mode: schema parameter is now available alongside with just a prompt usage before.

Benefit is that your output is invariant. You can extend and stack various translators with separated shell laucnhes.

Screenshot below is the application of the google-translate engine in manual batching mode.
🚀 Performance: 2.5 it / sec (in the case of a single field translation)

🌟 about bulk-translate: https://github.com/nicolay-r/bulk-translate
🌌 nlp-thirdgate: https://github.com/nicolay-r/nlp-thirdgate?tab=readme-ov-file
  • 1 reply
·
replied to ychen's post 1 day ago
view reply

Thanks! Any publicly available resources of such a synthetic texts that would lead to your observations?

reacted to ychen's post with 👍 1 day ago
view post
Post
1927
Here's some annoying keywords that 4o tends to use when responding to personal experiences with negative sentiments. Will be updated over time.

rough, tough, sound like, sounds like, frustrating, overwhelming
·
posted an update 6 days ago
view post
Post
2320
📢 For those who start to work with LLM streaming in web, here is a minimalistic example in JS for accessing server hosted by FastAPI via REST:
https://gist.github.com/nicolay-r/840425749cf6d3e397da3d329e894d59

The code above is a revised verison for accessing Replicate API posted earlier
https://huggingface.co/posts/nicolay-r/390307941200307

The key difference from Replicate API:
- using only POST for passing a body with parameters and fetching the reader.
posted an update 7 days ago
view post
Post
2393
📢 For those who consider a quick and inplace annotation of entities in JSON / CSV tabular data, I got a good news. So far releasing the latest version of the bulk-ner which does these things for you:
🌟 https://github.com/nicolay-r/bulk-ner/releases/tag/0.25.2

bulk-ner is a no-string wrapper over NER service using popular frameworks like DeepPavlov, Spacy, Flair.

What's new? The latest 0.25.2 version has the following key features:
🔧 Fixed: 🐛 the output ignores other input content in input #31
🔥 Schemas support: you can annotate various coulmns by combining them as you wish and map onto the other output colums (see 📸 below) #28

Below is the screenshot on how you can quick start of using it with Spacy models.

🌌 List of other providers @ nlp-thirdgate:
https://github.com/nicolay-r/nlp-thirdgate/tree/master/ner
reacted to csabakecskemeti's post with 👍 8 days ago
reacted to fffiloni's post with 🔥 8 days ago
reacted to tianchez's post with 🚀 8 days ago
view post
Post
3833
Introducing VLM-R1!

GRPO has helped DeepSeek R1 to learn reasoning. Can it also help VLMs perform stronger for general computer vision tasks?

The answer is YES and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and eval on RefCOCO Val and RefGTA (an OOD task).

https://github.com/om-ai-lab/VLM-R1
reacted to benhaotang's post with 🚀 8 days ago
view post
Post
2362
Try out my updated implementation of forked OpenDeepResearcher(link below) as an OpenAI compatible endpoint, but with full control, can be deployed completely free with Gemini api or completely locally with ollama, or pay-as-you-go in BYOK format, the AI agents will think dynamically based on the difficulties of given research, compatible with any OpenAI compatible configurable clients(Msty, Chatbox, even vscode AI Toolkit playground).

If you don't want to pay OpenAI $200 to use or want to take control of your deep research, check out here:
👉 https://github.com/benhaotang/OpenDeepResearcher-via-searxng

**Personal take**

Based on my testing against Perplexity's and Gemini's implementation with some Physics domain questions, mine is comparable and very competent at finding even the most rare articles or methods.

Also a funny benchmark of mine to test all these searching models, is to trouble shot a WSL2 hanging issue I experienced last year, with prompt:

> wsl2 in windows hangs in background with high vmmem cpu usage once in a while, especially after hibernation, no error logs captured in linux, also unable to shutdown in powershell, provide solutions

the final solution that took me a day last year to find is to patch the kernel with some steps documented in carlfriedrich's repo and wait Microsoft to solve it(it is buried deep in wsl issues). Out of the three, only my Deep Research agent has found this solution, Perplexity and Gemini just focus on other force restart or memory management methods. I am very impressed with how it has this kind of obscure and scarce trouble shooting ability.

**Limitations**

Some caveats to be done later:
- Multi-turn conversation is not yet supported, so no follow-up questions
- System message is only extra writing instructions, don't affect on search
- Small local model may have trouble citing source reliably, I am working on a fix to fact check all citation claims
  • 1 reply
·
reacted to m-ric's post with 🔥 9 days ago
view post
Post
4693
"𝟮𝟬𝟮𝟱 𝘄𝗶𝗹𝗹 𝗯𝗲 𝘁𝗵𝗲 𝘆𝗲𝗮𝗿 𝗼𝗳 𝗔𝗜 𝗮𝗴𝗲𝗻𝘁𝘀": this statement has often been made, here are numbers to support it.

I've plotted the progress of AI agents on GAIA test set, and it seems they're headed to catch up with the human baseline in early 2026.

And that progress is still driven mostly by the improvement of base LLMs: progress would be even faster with fine-tuned agentic models.
posted an update 9 days ago
view post
Post
1490
📢 If you're around Replicate AI models and wish to use them in streaming mode via JS, then this snippet might be a quick way to experiment with streaming API usage:
https://gist.github.com/nicolay-r/86fc212086c0955d541244253ec0564b

Why it matters? The original docs has:
🟢 No the relate support for JS rather only Python/HTTP and NodeJS by using the replicate package.
🟢 Mixture of NodeJS and bash curl snippets:
https://replicate.com/docs/topics/predictions/streaming

Special thanks to the reated template for accessing APIs of other vendors like Claude / OpenAI by Simon Willson in the following post:
https://til.simonwillison.net/llms/streaming-llm-apis

Default model: meta-llama/Meta-Llama-3-70B

PS: I am happy and open for your comments related to this solution
reacted to mmhamdy's post with 👀 12 days ago
view post
Post
2946
⛓ Evaluating Long Context #2: SCROLLS and ZeroSCROLLS

In this series of posts about tracing the history of long context evaluation, we started with Long Range Arena (LRA). Introduced in 2020, Long Range Arens (LRA) is one of the earliest benchmarks designed to tackle the challenge of long context evaluation. But it wasn't introduced to evaluate LLMs, but rather the transformer architecture in general.

📜 The SCROLLS benchmark, introduced in 2022, addresses this gap in NLP/LLM research. SCROLLS challenges models with tasks that require reasoning over extended sequences (according to 2022 standards). So, what does it offer?

1️⃣ Long Text Focus: SCROLLS (unlike LRA) focus mainly on text and contain inputs with thousands of words, testing models' ability to synthesize information across lengthy documents.
2️⃣ Diverse Tasks: Includes summarization, question answering, and natural language inference across domains like literature, science, and business.
3️⃣ Unified Format: All datasets are available in a text-to-text format, facilitating easy evaluation and comparison of models.

Building on SCROLLS, ZeroSCROLLS takes long text evaluation to the next level by focusing on zero-shot learning. Other features include:

1️⃣ New Tasks: Introduces tasks like sentiment aggregation and sorting book chapter summaries.
2️⃣ Leaderboard: A live leaderboard encourages continuous improvement and competition among researchers.

💡 What are some other landmark benchmarks in the history of long context evaluation? Feel free to share your thoughts and suggestions in the comments.

- SCROLLS Paper: SCROLLS: Standardized CompaRison Over Long Language Sequences (2201.03533)
- ZeroSCROLLS Paper: ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding (2305.14196)
reacted to sequelbox's post with 🧠👀 12 days ago
reacted to ginipick's post with 🚀 12 days ago
view post
Post
5469
Time Stream ⏳🚀

Time Stream is a groundbreaking AI tool that transforms your text into a mesmerizing video journey from the past to the future. With this innovative technology, your ideas evolve over time, visualized through a dynamic image strip and a fluid video narrative. Imagine typing a simple prompt and watching as your words transform into vivid scenes that capture every moment of change—like a time machine for creativity! 🎥✨

Key Features: • Text-to-Video Transformation: Enter any text, and Time Stream converts it into a compelling video that travels through time, turning your ideas into a visual story. 📽️
• Dynamic Image Strip: Alongside the video, a vibrant image strip is created, showcasing each stage of the transformation so you can see every detail of the evolution. 📸
• Customizable Settings: Adjust parameters such as strength, guidance scale, and more to fine-tune your video’s appearance and ensure it perfectly matches your creative vision. ⚙️
• User-Friendly Interface: With a modern and sleek design, Time Stream is incredibly easy to use. Its intuitive layout lets you focus on your creativity without any technical hurdles. 🖥️🌟

Time Stream is perfect for artists, storytellers, designers, and anyone who loves to see their ideas come to life in new and exciting ways. Whether you’re reflecting on the past, celebrating the present, or dreaming about the future, Time Stream turns your narrative into a vivid, ever-changing masterpiece. Dive in and let your imagination soar as you journey through time, one image at a time! 🚀🔥

ginipick/Time-Stream
reacted to s-emanuilov's post with 🔥 12 days ago
view post
Post
5151
Tutorial 💥 Training a non-English reasoning model with GRPO and Unsloth

I wanted to share my experiment with training reasoning models in languages other than English/Chinese.

Using Llama 3.1 8B as base, GRPO trainer from trl, and Unsloth optimizations, I got a working prototype in Bulgarian after ~5 hours on an L40S GPU. The approach should work for any language where the base model has some pre-training coverage.

Full code and tutorial here: https://unfoldai.com/reasoning-in-a-non-english-language/

The model itself: s-emanuilov/LLMBG-Llama-3.1-8B-BG-Reasoning-v0.1

I hope this helps anyone looking to build reasoning models in their language.
·
reacted to schuler's post with 👍 12 days ago
view post
Post
7216
📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformers arena.

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm
  • 2 replies
·
reacted to davidberenstein1957's post with 👀 12 days ago
reacted to lewtun's post with ❤️ 12 days ago
view post
Post
4468
Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to only retain problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g for cases with malformed answers that can’t be verified with a rules-based parser)

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: https://huggingface.co/blog/open-r1/update-2
reacted to ImranzamanML's post with 👍 12 days ago
view post
Post
3131
Hugging Face just launched the AI Agents Course – a free journey from beginner to expert in AI agents!

- Learn AI Agent fundamentals, use cases and frameworks
- Use top libraries like LangChain & LlamaIndex
- Compete in challenges & earn a certificate
- Hands-on projects & real-world applications

https://huggingface.co/learn/agents-course/unit0/introduction

You can join for a live Q&A on Feb 12 at 5PM CET to learn more about the course here

https://www.youtube.com/live/PopqUt3MGyQ