perf: optimize JSON serialization for OCI manifests #388

Draft
sajayantony wants to merge 4 commits into oras-project:main from sajayantony:perf/json-serialization-optimization

@sajayantony (Collaborator)

Description

Performance optimizations for the Go-compatible JSON serialization introduced in #348. Now that #366 has merged comprehensive serialization tests, we have high confidence these changes preserve correctness.

Changes

Fast-path escape detection (NeedsEscaping())

  • Scans for escapable characters before invoking the full escape pass
  • Most OCI strings (digests, media types, annotation keys) are clean ASCII and skip escaping entirely
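
A minimal sketch of what such a fast-path check could look like. Only the name NeedsEscaping() comes from the description above; the body and the exact escape set (control characters, quote, backslash, the HTML-sensitive characters and U+2028/U+2029 that Go's encoding/json escapes by default) are assumptions for illustration, not the PR's code.

```csharp
using System;

// Illustrative stand-in for NeedsEscaping(); the exact escape set is an assumption.
static bool NeedsEscaping(ReadOnlySpan<char> value)
{
    foreach (char c in value)
    {
        bool escapable =
            c < 0x20 || c == '"' || c == '\\' ||   // control chars, quote, backslash
            c == '<' || c == '>' || c == '&' ||     // HTML-sensitive (assumed, as in Go's default)
            c == '\u2028' || c == '\u2029';         // line/paragraph separators (assumed)
        if (escapable) return true;
    }
    return false;
}

// A typical media type is clean, so the full escape pass can be skipped.
Console.WriteLine(NeedsEscaping("application/vnd.oci.image.manifest.v1+json")); // False
Console.WriteLine(NeedsEscaping("a \"quoted\" <value>"));                        // True
```

The scan is a single allocation-free pass, so in the common clean-ASCII case the caller can write the original string as-is.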

Allocation reduction

  • Replace StringBuilder with ArrayPool<char> in EscapeJsonString
  • Direct hex-digit lookup via ReadOnlySpan<byte> instead of string interpolation
  • Pre-encode UTF-8 keys into a single pooled buffer for non-ASCII dictionary sort
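
The rent/return pattern and the hex-digit lookup can be sketched roughly as follows. This is a simplified illustration, not the PR's EscapeJsonString: it handles only quote, backslash, and control characters, and it always emits the \u00XX form rather than short escapes such as \n. The UTF-8 string literal assumes C# 11 or later.

```csharp
using System;
using System.Buffers;

// Illustrative sketch, not the PR's implementation; handles only a small escape set.
static string EscapeJsonString(string value)
{
    // Worst case: every char expands to a six-char \uXXXX sequence.
    char[] buffer = ArrayPool<char>.Shared.Rent(value.Length * 6);
    try
    {
        ReadOnlySpan<byte> hexDigits = "0123456789abcdef"u8; // immutable lookup table
        int n = 0;
        foreach (char c in value)
        {
            if (c == '"' || c == '\\')
            {
                buffer[n++] = '\\';
                buffer[n++] = c;
            }
            else if (c < 0x20)
            {
                // \u00XX via table lookup instead of string interpolation.
                buffer[n++] = '\\';
                buffer[n++] = 'u';
                buffer[n++] = '0';
                buffer[n++] = '0';
                buffer[n++] = (char)hexDigits[c >> 4];
                buffer[n++] = (char)hexDigits[c & 0xF];
            }
            else
            {
                buffer[n++] = c;
            }
        }
        return new string(buffer, 0, n);
    }
    finally
    {
        // Always hand the buffer back, even if escaping throws.
        ArrayPool<char>.Shared.Return(buffer);
    }
}

Console.WriteLine(EscapeJsonString("tab\there")); // tab\u0009here
```

Because the buffer is rented and returned within a single call, no state is shared between callers.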

ASCII fast-path sort (OciDictionaryConverter)

  • When all annotation keys are ASCII (the common case), use string.CompareOrdinal — equivalent to UTF-8 bytewise order with zero extra allocations
  • Non-ASCII slow path encodes each key once (O(N) encodings) instead of re-encoding on every comparison (O(N log N) encodings)
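
A rough sketch of the two sort paths, under stated assumptions: SortKeys is a hypothetical helper (not the PR's SortByUtf8Key()), and it allocates one byte array per key rather than using a single pooled buffer. The second example shows why the slow path is needed: UTF-16 ordinal order and UTF-8 byte order disagree once surrogate pairs are involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Illustrative sketch; SortKeys is a hypothetical helper, not the PR's code.
static string[] SortKeys(string[] keys)
{
    string[] sorted = (string[])keys.Clone();
    bool allAscii = sorted.All(k => k.All(c => c < 0x80));
    if (allAscii)
    {
        // For ASCII, UTF-16 code-unit order equals UTF-8 byte order.
        Array.Sort(sorted, string.CompareOrdinal);
        return sorted;
    }

    // Slow path: encode each key once, then sort by the encoded bytes.
    byte[][] utf8 = sorted.Select(k => Encoding.UTF8.GetBytes(k)).ToArray();
    Array.Sort(utf8, sorted,
        Comparer<byte[]>.Create((a, b) => a.AsSpan().SequenceCompareTo(b)));
    return sorted;
}

Console.WriteLine(string.Join(",", SortKeys(new[] { "b", "a", "B" }))); // B,a,b

// U+FF5E encodes as EF BD 9E, U+1F600 as F0 9F 98 80: U+FF5E sorts first in UTF-8,
// although ordinal comparison would put the surrogate pair (D83D DE00) first.
string[] mixed = SortKeys(new[] { "\U0001F600", "\uFF5E" });
Console.WriteLine(mixed[0] == "\uFF5E"); // True
```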

Zero-copy deserialization for buffered streams (ManifestStore)

  • Use MemoryStream.TryGetBuffer() instead of .ToArray() to avoid buffer copy
  • Add Deserialize<T>(ReadOnlySpan<byte>) overload for span-based deserialization
  • Replace DeserializeAsync with sync Deserialize for already-buffered data — async adds ~50-80% overhead for no benefit on buffered streams
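
A minimal sketch of the zero-copy path, using only System.Text.Json and MemoryStream APIs. The annotation payload and variable names are made up for illustration; the PR's own Deserialize<T>(ReadOnlySpan<byte>) overload on its internal serializer is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

// Simulate a response body that has already been buffered into a MemoryStream.
byte[] body = Encoding.UTF8.GetBytes("{\"org.opencontainers.image.title\":\"demo\"}");
using var ms = new MemoryStream();   // default constructor: the buffer is exposable
ms.Write(body, 0, body.Length);

var annotations = ms.TryGetBuffer(out ArraySegment<byte> segment)
    // No copy: deserialize synchronously from the stream's backing array.
    ? JsonSerializer.Deserialize<Dictionary<string, string>>(segment.AsSpan())
    // Fallback copies, e.g. when the stream wraps an array that is not exposable.
    : JsonSerializer.Deserialize<Dictionary<string, string>>(ms.ToArray());

Console.WriteLine(annotations?["org.opencontainers.image.title"]); // demo
```

TryGetBuffer returns the written bytes from the stream's origin regardless of the current Position, so this pattern fits streams that were buffered in full and not yet partially consumed.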

Benchmark project (benchmarks/)

  • Perf tests for serialize, sync deserialize, and async deserialize
  • Covers manifests with 5 and 50 annotations, index with 10 manifests
  • See benchmarks/README.md for usage

Thread Safety (Multi-Tenant Server Use)

All changes are safe for concurrent multi-tenant use:

| Component | Status | Notes |
|---|---|---|
| NeedsEscaping() | Safe | Pure function, no state |
| EscapeJsonString() ArrayPool | Safe | Rent/return per call in try/finally |
| SortByUtf8Key() ArrayPool | Safe | Rent/return per call in try/finally |
| HexDigits ReadOnlySpan | Safe | Immutable literal |
| ManifestStore sync deser | Safe | Instance method, no shared state |

No static dictionaries, caches, or shared mutable collections introduced.

Benchmark Results

| Scenario | Baseline (STJ) | w/ Optimization |
|---|---|---|
| Serialize, 5 annotations | 11.5 µs | ~30 µs |
| Serialize, 50 annotations | 73.5 µs | ~140 µs |
| Deserialize sync, 5 annotations | 18.4 µs | 10.9 µs |
| Deserialize sync, 50 annotations | 95.7 µs | 48.3 µs |

Note: Serialization is slower than the System.Text.Json (STJ) baseline because of the Go-compatible escaping, which is required for digest compatibility. Deserialization is faster thanks to the custom JSON options combined with the sync path.

Known Issues / Follow-ups

  1. Benchmark uses reflection to access the internal OciJsonSerializer. This is acceptable for perf testing but will break if internals are renamed. Consider exposing the internals to the benchmark assembly via InternalsVisibleTo in a follow-up.
  2. Sync deserialization assumption — ProcessReferrersAndPushIndex and IndexReferrersForDelete assume the stream is always buffered. This is true at all current call sites but should be documented as an invariant.

Testing

All 502 existing tests pass. The serialization test suite (#366) provides comprehensive coverage for the escaping and sorting behavior.


Generated with AI assistance

sajayantony and others added 4 commits May 27, 2026 00:45
- Add NeedsEscaping() fast-path to skip escape pass for clean strings
- Replace StringBuilder with ArrayPool<char> in EscapeJsonString
- Use direct hex-digit lookup table instead of string interpolation
- ASCII fast-path sort in OciDictionaryConverter (ordinal == UTF-8)
- Pre-encode UTF-8 keys once for non-ASCII slow path sort
- Replace DeserializeAsync with sync Deserialize for buffered streams

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Sajay Antony <sajaya@microsoft.com>

BenchmarkDotNet-based performance tests for OCI manifest
serialization and deserialization paths. Covers serialize,
sync deserialize, and async deserialize with varying
annotation counts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Sajay Antony <sajaya@microsoft.com>

Replace ms.ToArray() with MemoryStream.TryGetBuffer() to avoid
allocating a copy of the buffer. Add Deserialize<T>(ReadOnlySpan<byte>)
overload to support span-based deserialization from buffer segments.

This reduces GC pressure in multi-tenant server scenarios where
many manifests are deserialized concurrently.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Sajay Antony <sajaya@microsoft.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Sajay Antony <sajaya@microsoft.com>
codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 92.74611% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.85%. Comparing base (b08920c) to head (b7dd260).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../OrasProject.Oras/Registry/Remote/ManifestStore.cs | 73.80% | 6 Missing and 5 partials ⚠️ |
| ...rasProject.Oras/Serialization/OciJsonSerializer.cs | 96.00% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #388      +/-   ##
==========================================
- Coverage   92.03%   91.85%   -0.19%     
==========================================
  Files          67       67              
  Lines        2938     3069     +131     
  Branches      380      398      +18     
==========================================
+ Hits         2704     2819     +115     
- Misses        139      152      +13     
- Partials       95       98       +3     

