Can Zipformer's 130ms of padding be reduced further? #2089

uni-rini-sharon · 2026-06-17T02:46:43Z


The 13-frame padding for convolution and right context in Zipformer effectively adds 130 ms of latency to the overall streaming ASR pipeline. Is there any way to reduce this? Has anyone tried training a causal model, or other methods, to bring this down?

As far as we can see, the 13 frames consist of 7 frames needed by the convolution kernels and 6 frames for Zipformer's internal right context.
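For reference, here is a minimal sketch of the arithmetic behind the 130 ms figure. It assumes a 10 ms feature frame shift, which is implied by 13 frames = 130 ms but not stated above. The constant and function names are illustrative only and are not identifiers from the Zipformer code.

```python
# Assumption: 10 ms frame shift, inferred from "13 frames == 130 ms".
FRAME_SHIFT_MS = 10

# Breakdown as described in this post (hypothetical names, not library identifiers).
CONV_PAD_FRAMES = 7       # frames needed by the convolution kernels
RIGHT_CONTEXT_FRAMES = 6  # frames of internal right context


def padding_latency_ms(conv_frames, right_context_frames,
                       frame_shift_ms=FRAME_SHIFT_MS):
    """Latency in milliseconds contributed by the padded frames."""
    return (conv_frames + right_context_frames) * frame_shift_ms


total_ms = padding_latency_ms(CONV_PAD_FRAMES, RIGHT_CONTEXT_FRAMES)
print(f"padding latency: {total_ms} ms")
```

Under that assumption, removing the right context alone would leave 70 ms from the convolution padding, so both parts would need to shrink to get well below that.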

@csukuangfj @pkufool


Replies: 0 comments