Enhance demo.py with audio saving and settings loading #89
Thanks. This is a very large PR. Many of its features are very useful, such as:
However, some of these changes may not be very suitable for the project's default demo interface:
If you have time to make the changes according to the above suggestions, I would be happy to merge this PR. By the way, in the latest code, the generate function returns a list of np.ndarray, each with shape (T,) at 24 kHz. This is incompatible with the current code.
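To make the new return type concrete, here is a minimal sketch of writing one such array to disk. It assumes each array holds float samples in [-1.0, 1.0] and that each list element is a separate utterance; neither is confirmed above. `save_wav` is a hypothetical helper, not a function from the project.

```python
import wave

import numpy as np

SAMPLE_RATE = 24_000  # 24 kHz, as stated in the review comment


def save_wav(audio, path, sample_rate=SAMPLE_RATE):
    """Write a 1-D array of shape (T,) to a mono 16-bit PCM WAV file.

    Assumption: samples are floats in [-1.0, 1.0]; values outside
    that range are clipped before conversion.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim != 1:
        raise ValueError(f"expected shape (T,), got {audio.shape}")
    # Scale to signed 16-bit little-endian integers.
    pcm = (np.clip(audio, -1.0, 1.0) * 32767.0).astype("<i2")
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())
    return len(pcm)  # number of frames written
```

With a list of arrays, a caller could loop over it, for example `for i, audio in enumerate(outputs): save_wav(audio, f"out_{i}.wav")`, where `outputs` stands for whatever the generate function returned.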
da92c05 to 9e21256
Added options for output directory and format in CLI.
I think I have made the corrections that you asked for. I also added stereo output to the code: when I tried to use a mono output to make a video in Wan2GP, I got an error, and I had to use Audacity to convert it into a stereo track to make it work. I hope this is what you were asking for. Thank you.
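For illustration, the mono-to-stereo step can be sketched as duplicating the single channel. This is an assumption about the approach, not the code in this PR, and `mono_to_stereo` is a hypothetical name.

```python
import numpy as np


def mono_to_stereo(mono):
    """Duplicate a mono signal of shape (T,) into two identical channels.

    Returns an array of shape (T, 2), one row per frame, so left and
    right carry the same samples.
    """
    mono = np.asarray(mono)
    if mono.ndim != 1:
        raise ValueError(f"expected shape (T,), got {mono.shape}")
    return np.stack([mono, mono], axis=-1)
```

The result sounds the same as the mono input; it only changes the channel count that downstream tools see.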
@zhu-han Changes in this PR:
- When a custom voice is added in "Voice Clone", its spoken words are automatically transcribed into the "Reference Text Box". If the "--no-asr" argument is used, you have to enter the words spoken in your custom voice file manually, as before.
- Models are now downloaded into the "ckpts" folder in the OmniVoice root.
- Added an optional "--inbrowser" argument that lets Gradio open in the browser automatically after a successful launch.
- Generated audio is auto-saved into the "outputs" folder in the root.
- Added an "omnivoice_settings.JSON" file that automatically saves and loads the last used settings.
- Added an option to save as a .wav or .mp3 file.
- Moved the generate button to below the Status box, so you don't have to scroll all the way to the bottom to click generate.
- Added a "saved to outputs/name & time" message that includes the total time the generation took.
- Updated argument handling and UI components for better usability.

Thank you.
This is in my portable version I mentioned in "Torch 2.8 is known to have memory leak problems" #9
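As a rough sketch of how saving and loading the last used settings could work, assuming a flat JSON object: the keys in `DEFAULTS` and both function names are hypothetical and are not taken from the PR.

```python
import json
from pathlib import Path

# Hypothetical setting names, for illustration only.
DEFAULTS = {"output_format": "wav", "output_dir": "outputs"}


def load_settings(path, defaults=DEFAULTS):
    """Return the defaults overlaid with any known keys found in the file.

    A missing or unreadable file falls back to the defaults, so the
    first launch works without a settings file.
    """
    settings = dict(defaults)
    try:
        loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return settings
    if isinstance(loaded, dict):
        # Ignore keys that are not known settings.
        settings.update({k: v for k, v in loaded.items() if k in defaults})
    return settings


def save_settings(path, settings):
    """Write the settings as indented JSON, replacing any existing file."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Falling back to defaults on a corrupt file keeps the demo launchable even if the settings file was edited by hand.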