Transform and identify speech with MMS
Generate a detailed script for a podcast or lecture from text input