Vardaan Pahuja

vardaan123

https://vardaan123.github.io/

AI & ML interests

Knowledge Graph Reasoning, Graph Representation Learning, Multimodal KGs

Recent Activity

commented on a paper 5 days ago

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

authored a paper 5 days ago

Diversifying Joint Vision-Language Tokenization Learning

authored a paper 5 days ago

A Systematic Investigation of KB-Text Embedding Alignment at Scale

vardaan123's activity

commented on a paper 5 days ago

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published 7 days ago • 9

authored 5 papers 5 days ago

Diversifying Joint Vision-Language Tokenization Learning

Paper • 2306.03421 • Published Jun 6, 2023 • 1

A Systematic Investigation of KB-Text Embedding Alignment at Scale

Paper • 2106.01586 • Published Jun 3, 2021

Bringing Back the Context: Camera Trap Species Identification as Link Prediction on Multimodal Knowledge Graphs

Paper • 2401.00608 • Published Dec 31, 2023 • 1

A Retrieve-and-Read Framework for Knowledge Graph Link Prediction

Paper • 2212.09724 • Published Dec 19, 2022 • 1

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published 7 days ago • 9

upvoted a paper 6 days ago

Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents

Paper • 2502.11357 • Published 7 days ago • 9

New activity in huggingface/HuggingDiscussions 6 days ago

[FEEDBACK] Daily Papers

118

#32 opened 9 months ago by kramp

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct 3 months ago

Flash Attention Support

#41 opened 5 months ago by rameshch

liked a model 5 months ago

mistralai/Pixtral-12B-2409

Image-Text-to-Text • Updated Dec 26, 2024 • • 608

upvoted 2 papers 5 months ago

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Paper • 2410.05080 • Published Oct 7, 2024 • 21

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

Paper • 2410.05243 • Published Oct 7, 2024 • 19

upvoted a paper 11 months ago

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Paper • 2404.05719 • Published Apr 8, 2024 • 82

liked a model 11 months ago

timm/vit_base_patch16_clip_384.laion2b_ft_in1k

Image Classification • Updated Jan 21 • 1.08k • 5

upvoted 2 papers 12 months ago

LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error

Paper • 2403.04746 • Published Mar 7, 2024 • 24

Learning and Leveraging World Models in Visual Representation Learning

Paper • 2403.00504 • Published Mar 1, 2024 • 32

upvoted a paper about 1 year ago

A Retrieve-and-Read Framework for Knowledge Graph Link Prediction

Paper • 2212.09724 • Published Dec 19, 2022 • 1

updated 2 Spaces about 1 year ago

COSMO

🦀

COSMO Demo

🦀

liked a Space about 1 year ago

COSMO Demo

🦀