Audio-Driven / Lip-Synced Video Generation Support (as available on magi.sand.ai)

Hi Team,

First of all, thank you for releasing the MAGI-1 model and the accompanying resources — the work is really impressive.

I have a question and feature request regarding audio-driven video generation.

On the magi.sand.ai website, there is an option to generate videos that are lip-synced to an input audio file. However, in the open-source implementation here, I don't see any documented way to:

- provide an audio file as input,
- generate a lip-synced video using both a reference video and audio, or
- align the generated frames to spoken audio the same way the website demonstrates.

Could you please clarify:

1. Is audio-conditioned generation / lip-sync supported in the open-source release?
2. If yes, can you point to the script, parameters, or example usage?
3. If not currently available, is this feature planned for a future release?

This capability would be extremely valuable for creating realistic talking videos directly from audio + reference visuals, similar to what the website already provides.

Thanks in advance!

Audio-Driven / Lip-Synced Video Generation Support (as available on magi.sand.ai) #113

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Audio-Driven / Lip-Synced Video Generation Support (as available on magi.sand.ai) #113

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions