Yi3852 committed on
Commit 11a857b · verified · 1 Parent(s): 192296e

Update README.md

Files changed (1)
  1. README.md +42 -3
README.md CHANGED
@@ -1,3 +1,42 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ pipeline_tag: audio-text-to-text
+ language:
+ - en
+ - zh
+ base_model:
+ - Yi3852/MuFun-Base
+ ---
+ A prompt generator for the [ACE-Step](https://huggingface.co/ACE-Step/ACE-Step-v1-3.5B) music generation model, fine-tuned from the MuFun model proposed in [Advancing the Foundation Model for Music Understanding](https://arxiv.org/abs/2508.01178).
+
+ ## Usage
+ Some audio processing packages, such as mutagen and torchaudio, must be installed.
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ hf_path = 'Yi3852/MuFun-ACEStep'
+ tokenizer = AutoTokenizer.from_pretrained(hf_path, use_fast=False)
+ device = 'cuda'
+ model = AutoModelForCausalLM.from_pretrained(hf_path, trust_remote_code=True, torch_dtype="bfloat16")
+ model.to(device)
+
+ aud = "/path/to/your/song.mp3.wav"
+ inp = '<audio>\nDeconstruct this song, listing its tags and lyrics. Directly output a JSON object with prompt and lyrics fields, without any additional explanations or text.'
+ res = model.chat(prompt=inp, audio_files=aud, segs=None, tokenizer=tokenizer)
+ print(res)
+ # { "prompt": "110 bpm, soulful, electric, synthesizer, catchy, keyboard, guitar",
+ # "lyrics": "[verse] \nNeon lights, they flicker bright, \nCity hums in dead of night. \nRhythms pulse through concrete veins, \nLost in echoes of refrains. \n\nBassline grooves in my chest, \nHeartbeats match the city's vest. \nElectric whispers fill the air, \nSynthesized dreams everywhere. \n\n[chorus] \nTurn it up and let it flow, \nFeel the fire, let it grow. \nIn this rhythm, we belong, \nHere tonight, sing our song. \n\n[verse] \nGuitar strings, they start to weep, \nWake the soul from silent sleep. \nEvery note a story told, \nIn this night, we're bold and gold. \n\nVoices blend in harmony, \nLost in pure cacophony. \nTimeless echoes, timeless cries, \nSoulful shouts beneath the skies. \n\n[bridge] \nKeyboard dances on the keys, \nMelodies on evening breeze. \nCatch the tune and hold it tight, \nIn this moment, we take flight. \n\n[chorus] \nTurn it up and let it flow, \nFeel the fire, let it grow. \nIn this rhythm, we belong, \nHere tonight, sing our song. "
+ # }
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @misc{jiang2025advancingfoundationmodelmusic,
+ title={Advancing the Foundation Model for Music Understanding},
+ author={Yi Jiang and Wei Wang and Xianwen Guo and Huiyun Liu and Hanrui Wang and Youri Xu and Haoqi Gu and Zhongqian Xie and Chuanjiang Luo},
+ year={2025},
+ eprint={2508.01178},
+ archivePrefix={arXiv},
+ primaryClass={cs.SD},
+ url={https://arxiv.org/abs/2508.01178},
+ }