Audio-Driven / Lip-Synced Video Generation Support (as available on magi.sand.ai) #113

Description

@gamechanger-creator

Hi Team,

First of all, thank you for releasing the MAGI-1 model and the accompanying resources — the work is really impressive.

I have a question and feature request regarding audio-driven video generation.

On the magi.sand.ai website, there is an option to generate videos that are lip-synced to an input audio file. However, in the open-source implementation here, I don't see any documented way to:

- provide an audio file as input,
- generate a lip-synced video using both a reference video and audio, or
- align the generated frames to spoken audio the same way the website demonstrates.

Could you please clarify:

1. Is audio-conditioned generation / lip-sync supported in the open-source release?
2. If yes, can you point to the script, parameters, or example usage?
3. If not currently available, is this feature planned for a future release?

This capability would be extremely valuable for creating realistic talking videos directly from audio + reference visuals, similar to what the website already provides.

Thanks in advance!
