PoC: Port AVIF decoder from dav1d to rav1d-safe#2853

Draft
Shnatsel wants to merge 2 commits into image-rs:main from Shnatsel:rav1d-safe-poc

@Shnatsel
Member

This will never be merged due to rav1d-safe being under the restrictive AGPL license.

This PR exists purely to evaluate the performance and correctness characteristics of rav1d-safe, and decide if trying to upstream it is worth it.

Since this is never going into production, I had Claude Opus 4.6 do the porting, as opposed to #2849, which is entirely handwritten. This may have introduced issues, so failures of this PR should not automatically be assumed to be failures of the rav1d-safe crate.

Shnatsel and others added 2 commits March 14, 2026 21:43
Replace the dav1d FFI-based AV1 decoder with rav1d-safe, a pure safe
Rust AV1 decoder. The safe managed API simplifies the code significantly:
decoder init uses a single decode() call instead of send_data/get_picture
retry loops, and 16-bit plane access is type-safe (&[u16]) eliminating
all the transmute/reshape helpers for FFI data.

Co-Authored-By: Claude <noreply@anthropic.com>
rav1d-safe's PlaneView::as_slice() returns the full backing buffer which
may be larger than stride * height. The YUV conversion code validates
that plane slices are exactly stride * height. Trim the plane slices to
the expected size before passing them to YuvPlanarImage.

Co-Authored-By: Claude <noreply@anthropic.com>
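
The trimming step described in the second commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `trim_plane` is a hypothetical helper name, and it assumes the stride is expressed in elements of the slice (not bytes) and that the backing buffer is at least `stride * height` elements long.

```rust
/// Hypothetical helper (not part of rav1d-safe or image): trims a plane's
/// backing buffer to exactly `stride * height` elements.
///
/// Assumes `stride` is counted in elements of `T`, not in bytes.
/// Returns `None` if `stride * height` overflows or the buffer is too short.
fn trim_plane<T>(buf: &[T], stride: usize, height: usize) -> Option<&[T]> {
    // Guard against overflow when computing the expected plane size.
    let len = stride.checked_mul(height)?;
    // `get` returns `None` instead of panicking when `len > buf.len()`.
    buf.get(..len)
}
```

Returning `Option` rather than slicing directly lets the caller turn an undersized buffer into a decoding error instead of a panic.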
@Shnatsel
Member Author

Shnatsel commented Mar 14, 2026

The handful of samples I've thrown at it decode fine.

The default configuration (all safe code) is about 2x slower than rav1d with handwritten assembly, as advertised. This is not an entirely fair comparison since rav1d has more AVX-512 codepaths than rav1d-safe does.

Here's the profile, measured with wondermagick: https://share.firefox.dev/47ESGSX
I see a hot scalar fallback in there accounting for 11% of the time. It's probably a bug and could likely be eliminated.

The unchecked feature improves runtime by about 8%, while the partial_asm feature fails to build, even though rav1d itself builds fine. This looks like a bug in the rav1d-safe build script to me.

When profiling rav1d I noticed it uses multi-threading while rav1d-safe does not. Here's the rav1d profile: https://share.firefox.dev/46XRJ89. When forced onto a single thread, rav1d performs about the same, since only a small portion is multi-threaded: https://share.firefox.dev/4lzSpqu

So between the hot scalar fallback, the missing AVX-512 codepaths, and the non-functional inline assembly for parts that couldn't be easily converted to Rust, I think there's plenty of performance still to be gained, and the gap could be closed significantly.
