Multimodal Classification Model (BM-v1)
This model combines text and image inputs to predict player moves from in-game screenshots of the popular 4X game Civilization VI. In use, the screenshots are provided directly and the text inputs are generated by an LLM.
Model Details
- Developed by: BeakerStreet
- Model type: Multimodal Classification Model
- Language(s): English
- License: MIT
Uses
Predicts the moves a player is likely to make, drawn from the complete sample space of all observed player moves, based on a provided screenshot and its associated text. The model can be fine-tuned to predict specific types of move (scouting, build orders, settle/doesn't settle).
Direct Use
Predicts the likely moves a player will make, from a complete sample space of all player moves, based on a provided screenshot and associated text.
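The prediction step described above can be illustrated with a minimal sketch. It assumes the model emits one raw score (logit) per move in the label space, which this card does not confirm. The move names, the example scores, and the `top_k_moves` helper are hypothetical and only show how a ranked list of likely moves could be read off a classifier's output.

```python
import math

# Hypothetical label space. The real model's label space is the set of
# all observed player moves, which this card does not enumerate.
MOVES = ["scout", "build_warrior", "build_settler", "settle_city"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_moves(logits, labels, k=3):
    """Return the k most probable (move, probability) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per move label")
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores standing in for the classifier's output on one
# screenshot and its associated text.
example_logits = [2.0, 0.5, 1.0, -1.0]
ranked = top_k_moves(example_logits, MOVES, k=2)

for move, prob in ranked:
    print(f"{move}: {prob:.3f}")
```

Returning the top few moves rather than a single label suits the "likely moves" framing, since several moves can be plausible from the same game state.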