Mahadi Hassan
Mahadih534
AI & ML interests
Generative AI | LLMs | NLP | Data Analytics | VLMs | Computer Vision | AI Agents
Organizations
Video-Reasoning_Datasets
SOTA Small Language Models
Qwen/Qwen2.5-0.5B-Instruct
Text Generation • 0.5B • 2.41M • 418
HuggingFaceTB/SmolLM3-3B
Text Generation • 3B • 98.7k • 848
HuggingFaceTB/SmolLM2-135M-Instruct
Text Generation • 0.1B • 329k • 273
Qwen/Qwen3-4B-Thinking-2507
Text Generation • 4B • 494k • 502
Medical Visual Question Answering (VQA) Datasets
Multilingual Chain of Thought (COT) Datasets for Fine-tuning
Bangla Instruction Dataset for Instruction Tuning
Bangla-TTS-Datasets
LLM Reasoning Dataset
Bangla_info-based-dataset
Datasets for Fine-tuning LLMs
Bangla Datasets for LLMs Finetuning
This collection gathers Bengali datasets that are effective and useful for LLM fine-tuning, instruction tuning, and other NLP tasks.
Customer_Support_LLM_Dataset_for_Finetunig
urvog/llama2_transcripts_healthcare_callcenter
1k • 219 • 5
aciborowska/customers-complaints-test
3k • 24
NebulaByte/E-Commerce_Customer_Support_Conversations
1k • 295 • 46
bitext/Bitext-customer-support-llm-chatbot-training-dataset
26.9k • 2.56k • 144
Medical Computer Vision Datasets
Text to Audio Dataset
Arabic Transformer Models
asafaya/bert-base-arabic
Fill-Mask • 0.1B • 9.82k • 40
akhooli/gpt2-small-arabic
Text Generation • 0.1B • 158 • 18
MohamedRashad/Arabic-Orpo-Llama-3-8B-Instruct
Text Generation • 8B • 3.77k • 16
malhajar/Mistral-7B-v0.1-arabic
Text Generation • 7B • 24 • 9
Bangla NLP Datasets
OCR-Reasoning_Datasets
Medical-Reasoning-Datasets
Bangla Visual Question Answering (VQA) Datasets
Visual Question Answering (VQA) Dataset for VLLMs
Instruction Datasets for LLMs Fine-Tuning
Bangla Chain of Thought (COT) Datasets
VLM fine-tuning instruction dataset
Bangla_vision_dataset
Medical_Dataset_for_LLM
Arabic Datasets for Fine-tuning
Medical LLMs Finetuning Datasets
Tech_Dataset_for_LLM_Finetunig
OCR
Small Language Model for Fine-tuning
Wav2Vec Datasets
Computer Vision Datasets (Multi-Domain)
Visual-Reasoning-Datasets