fengerhu's picture
Add link to paper (#1)
4f0425a verified
metadata
library_name: transformers
license: apache-2.0
pipeline_tag: image-text-to-text
tags:
  - multimodal
  - gui

MobiMind-Grounder-3B Model

This is the Grounder Model of MobiAgent with 3B parameters, capable of low-level UI element grounding in GUI agent task execution, as presented in the paper MobiAgent: A Systematic Framework for Customizable Mobile Agents.

About MobiAgent

MobiAgent is a powerful mobile agent system including:

  • An agent model family: MobiMind
  • An agent acceleration framework: AgentRR
  • An agent benchmark: MobiFlow

System Architecture:

Evaluation Results

## Usage

Deploy model inference service with vLLM:

vllm serve IPADS-SAI/MobiMind-Grounder-3B

For more usage details, e.g., execute GUI tasks with ADB or our Android App, please refer to our repo!