Fix exporting streaming zipformer models as non-streaming models#2095
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughIn ChangesShort-sequence edge scaling fix
Estimated code review effort🎯 2 (Simple) | ⏱️ ~5 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request modifies the forward method in scaling_converter.py to handle sequence lengths shorter than the kernel size by introducing an if/else conditional block. The reviewer suggests refactoring this to a unified padding and slicing approach using max to eliminate the conditional block, which ensures the code remains ONNX-export friendly and avoids potential issues with ONNX runtimes.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if seq_len < self.kernel_size: | ||
| left_edge = left_edge[:, :seq_len] | ||
| right_edge = right_edge[:, -seq_len:] | ||
| else: | ||
| t = seq_len - self.kernel_size | ||
| channels = left_edge.shape[0] | ||
| pad = torch.zeros( | ||
| channels, t, device=left_edge.device, dtype=left_edge.dtype | ||
| ) | ||
| left_edge = torch.cat((left_edge, pad), dim=-1) | ||
| right_edge = torch.cat((pad, right_edge), dim=-1) |
There was a problem hiding this comment.
The class NonStreamingChunkCausalDepthwiseConv1d is specifically designed to be ONNX-export friendly by avoiding conditionals where possible (as noted in its docstring).\n\nBy using a unified padding and slicing approach with max(0, seq_len - self.kernel_size), we can completely eliminate the if/else control flow block. This avoids generating ONNX If nodes, which can be problematic or inefficient for certain ONNX runtimes (e.g., TensorRT or mobile runtimes), while keeping the code much more concise.
pad_len = max(0, seq_len - self.kernel_size)\n pad = torch.zeros(\n left_edge.shape[0], pad_len, device=left_edge.device, dtype=left_edge.dtype\n )\n left_edge = torch.cat((left_edge, pad), dim=-1)[:, :seq_len]\n right_edge = torch.cat((pad, right_edge), dim=-1)[:, -seq_len:]|
@csukuangfj wonderful, many thanks! |
See errors reported at
#2086 (comment)
cc @nshmyrev
Summary by CodeRabbit