Analyze video to describe actions and transcribe audio
Interact with videos
Generate images from text descriptions
Generate images from text prompts