
Bui Van Hop

hllj

AI & ML interests

Computer Vision, Deep Learning, NLP

Organizations

Vietnamese VLM, Hugging Face Discord Community, Open Medical

hllj's activity

upvoted an article 11 days ago

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

By NormalUhr
upvoted an article 7 months ago

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By mlabonne
reacted to kenshinn's post with ❤️ 7 months ago
Sparse MoE (SMoE) has an unavoidable drawback: its performance depends heavily on hyper-parameters such as the number of activated experts per token (top-k) and the total number of experts.

Moreover, identifying optimal hyper-parameters without extensive ablation studies is challenging. As models continue to grow, this limitation can waste significant computational resources and, in turn, hinder the efficiency of training MoE-based models in practice.

Now, our DynMoE addresses these challenges! 🙌 DynMoE incorporates:
(1) a novel gating method that enables each token to automatically determine the number of experts to activate;
(2) an adaptive process that automatically adjusts the number of experts during training.

Extensive results across Vision, Language, and Vision-Language tasks demonstrate that DynMoE achieves performance competitive with GMoE on vision and language tasks and with MoE-LLaVA on vision-language tasks, while maintaining efficiency by activating fewer parameters (a minimal gating sketch follows this post).

Our code is available at https://github.com/LINs-lab/DynMoE; checkpoints are at LINs-lab/dynmoe-family-665ed5a331a7e84463cab01a.
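
The post describes the gating idea only at a high level. As a rough illustration of "each token decides how many experts to activate", here is a minimal, hypothetical PyTorch sketch: the class name DynamicTopAnyGate, the sigmoid-score-vs-learnable-threshold rule, and all parameter choices are assumptions made for illustration, not the authors' implementation (see the linked repo for that).

```python
import torch
import torch.nn as nn


class DynamicTopAnyGate(nn.Module):
    """Hypothetical gate: a token activates every expert whose score clears a
    learnable threshold, so the effective top-k varies per token (illustration only)."""

    def __init__(self, d_model: int, num_experts: int):
        super().__init__()
        self.score = nn.Linear(d_model, num_experts)              # per-expert gating scores
        self.threshold = nn.Parameter(torch.zeros(num_experts))   # learnable activation thresholds

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, d_model)
        logits = self.score(x)                                    # (num_tokens, num_experts)
        mask = (torch.sigmoid(logits) > torch.sigmoid(self.threshold)).float()
        # Guarantee at least one active expert per token by falling back to the argmax expert.
        top1 = logits.argmax(dim=-1, keepdim=True)
        mask.scatter_(-1, top1, 1.0)
        # Renormalize combination weights over the selected experts only.
        weights = torch.softmax(logits.masked_fill(mask == 0, float("-inf")), dim=-1)
        return weights, mask                                      # mix expert outputs with `weights`


# Different tokens end up activating different numbers of experts.
gate = DynamicTopAnyGate(d_model=16, num_experts=8)
tokens = torch.randn(4, 16)
weights, mask = gate(tokens)
print(mask.sum(dim=-1))  # per-token count of activated experts
```

Because the number of active experts is a per-token outcome of learned thresholds rather than a fixed top-k hyper-parameter, this kind of gate sidesteps the top-k tuning problem the post describes; DynMoE's actual formulation and its expert-count adaptation during training are in the repository above.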