feat(aws_durable_execution): persist trace context across suspend/resume #17773
joeyzhao2018 wants to merge 54 commits into
Performance SLOs
Comparing candidate joey/cross-invocation-tracecontext-propagation (db67ae1) with baseline joey/apm-ai-toolkit/aws-durable-execution-sdk-python (92f0e53)

📈 Performance Regressions (2 suites)

📈 iastaspects - 118/118
✅ add_aspect: Time: ✅ 104.846µs (SLO: <130.000µs 📉 -19.3%) vs baseline: +4.3%; Memory: ✅ 43.988MB (SLO: <46.000MB -4.4%) vs baseline: +4.8%
✅ add_inplace_aspect: Time: ✅ 101.578µs (SLO: <130.000µs 📉 -21.9%) vs baseline: -0.5%; Memory: ✅ 43.921MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ add_inplace_noaspect: Time: ✅ 28.128µs (SLO: <40.000µs 📉 -29.7%) vs baseline: -2.0%; Memory: ✅ 43.872MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ add_noaspect: Time: ✅ 49.152µs (SLO: <70.000µs 📉 -29.8%) vs baseline: +0.3%; Memory: ✅ 43.857MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%
✅ bytearray_aspect: Time: ✅ 265.249µs (SLO: <400.000µs 📉 -33.7%) vs baseline: -0.2%; Memory: ✅ 43.845MB (SLO: <46.000MB -4.7%) vs baseline: +4.8%
✅ bytearray_extend_aspect: Time: ✅ 654.784µs (SLO: <800.000µs 📉 -18.2%) vs baseline: +0.2%; Memory: ✅ 43.870MB (SLO: <46.000MB -4.6%) vs baseline: +4.5%
✅ bytearray_extend_noaspect: Time: ✅ 273.422µs (SLO: <400.000µs 📉 -31.6%) vs baseline: +0.7%; Memory: ✅ 43.967MB (SLO: <46.000MB -4.4%) vs baseline: +5.1%
✅ bytearray_noaspect: Time: ✅ 148.822µs (SLO: <300.000µs 📉 -50.4%) vs baseline: +0.9%; Memory: ✅ 43.975MB (SLO: <46.000MB -4.4%) vs baseline: +5.5%
✅ bytes_aspect: Time: ✅ 231.096µs (SLO: <300.000µs 📉 -23.0%) vs baseline: +0.2%; Memory: ✅ 43.962MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%
✅ bytes_noaspect: Time: ✅ 139.112µs (SLO: <200.000µs 📉 -30.4%) vs baseline: +0.4%; Memory: ✅ 43.791MB (SLO: <46.000MB -4.8%) vs baseline: +4.6%
✅ bytesio_aspect: Time: ✅ 3.824ms (SLO: <5.000ms 📉 -23.5%) vs baseline: +0.4%; Memory: ✅ 43.870MB (SLO: <46.000MB -4.6%) vs baseline: +4.6%
✅ bytesio_noaspect: Time: ✅ 322.401µs (SLO: <420.000µs 📉 -23.2%) vs baseline: +0.4%; Memory: ✅ 43.852MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%
✅ capitalize_aspect: Time: ✅ 89.959µs (SLO: <300.000µs 📉 -70.0%) vs baseline: +0.8%; Memory: ✅ 43.999MB (SLO: <46.000MB -4.3%) vs baseline: +5.0%
✅ capitalize_noaspect: Time: ✅ 274.466µs (SLO: <300.000µs -8.5%) vs baseline: +8.2%; Memory: ✅ 43.934MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ casefold_aspect: Time: ✅ 89.525µs (SLO: <500.000µs 📉 -82.1%) vs baseline: ~same; Memory: ✅ 43.930MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ casefold_noaspect: Time: ✅ 312.184µs (SLO: <500.000µs 📉 -37.6%) vs baseline: +1.1%; Memory: ✅ 43.962MB (SLO: <46.000MB -4.4%) vs baseline: +4.8%
✅ decode_aspect: Time: ✅ 87.271µs (SLO: <100.000µs 📉 -12.7%) vs baseline: ~same; Memory: ✅ 43.909MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ decode_noaspect: Time: ✅ 155.960µs (SLO: <210.000µs 📉 -25.7%) vs baseline: -1.0%; Memory: ✅ 43.868MB (SLO: <46.000MB -4.6%) vs baseline: +4.7%
✅ encode_aspect: Time: ✅ 84.877µs (SLO: <200.000µs 📉 -57.6%) vs baseline: +0.6%; Memory: ✅ 43.902MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%
✅ encode_noaspect: Time: ✅ 145.266µs (SLO: <200.000µs 📉 -27.4%) vs baseline: +0.1%; Memory: ✅ 43.876MB (SLO: <46.000MB -4.6%) vs baseline: +4.7%
✅ format_aspect: Time: ✅ 14.649ms (SLO: <19.200ms 📉 -23.7%) vs baseline: +0.6%; Memory: ✅ 43.933MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ format_map_aspect: Time: ✅ 16.368ms (SLO: <21.500ms 📉 -23.9%) vs baseline: ~same; Memory: ✅ 43.951MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%
✅ format_map_noaspect: Time: ✅ 361.466µs (SLO: <500.000µs 📉 -27.7%) vs baseline: ~same; Memory: ✅ 43.947MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ format_noaspect: Time: ✅ 310.122µs (SLO: <500.000µs 📉 -38.0%) vs baseline: ~same; Memory: ✅ 43.907MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%
✅ index_aspect: Time: ✅ 127.993µs (SLO: <300.000µs 📉 -57.3%) vs baseline: +5.0%; Memory: ✅ 43.998MB (SLO: <46.000MB -4.4%) vs baseline: +5.3%
✅ index_noaspect: Time: ✅ 41.123µs (SLO: <300.000µs 📉 -86.3%) vs baseline: +0.6%; Memory: ✅ 43.919MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ join_aspect: Time: ✅ 214.200µs (SLO: <300.000µs 📉 -28.6%) vs baseline: -0.3%; Memory: ✅ 43.880MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ join_noaspect: Time: ✅ 141.669µs (SLO: <300.000µs 📉 -52.8%) vs baseline: -0.9%; Memory: ✅ 43.927MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%
✅ ljust_aspect: Time: ✅ 587.609µs (SLO: <700.000µs 📉 -16.1%) vs baseline: 📈 +16.7%; Memory: ✅ 43.919MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ ljust_noaspect: Time: ✅ 260.659µs (SLO: <300.000µs 📉 -13.1%) vs baseline: +0.2%; Memory: ✅ 43.877MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ lower_aspect: Time: ✅ 309.436µs (SLO: <500.000µs 📉 -38.1%) vs baseline: -1.3%; Memory: ✅ 43.974MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%
✅ lower_noaspect: Time: ✅ 240.155µs (SLO: <300.000µs 📉 -19.9%) vs baseline: +0.5%; Memory: ✅ 43.887MB (SLO: <46.000MB -4.6%) vs baseline: +5.0%
✅ lstrip_aspect: Time: ✅ 0.279ms (SLO: <3.000ms 📉 -90.7%) vs baseline: +1.0%; Memory: ✅ 43.857MB (SLO: <46.000MB -4.7%) vs baseline: +4.7%
✅ lstrip_noaspect: Time: ✅ 0.179ms (SLO: <3.000ms 📉 -94.0%) vs baseline: +1.1%; Memory: ✅ 43.947MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ modulo_aspect: Time: ✅ 14.239ms (SLO: <18.750ms 📉 -24.1%) vs baseline: -0.1%; Memory: ✅ 43.953MB (SLO: <46.000MB -4.4%) vs baseline: +4.9%
✅ modulo_aspect_for_bytearray_bytearray: Time: ✅ 14.755ms (SLO: <19.350ms 📉 -23.7%) vs baseline: +0.2%; Memory: ✅ 43.943MB (SLO: <46.000MB -4.5%) vs baseline: +4.6%
✅ modulo_aspect_for_bytes: Time: ✅ 14.393ms (SLO: <18.900ms 📉 -23.8%) vs baseline: -0.3%; Memory: ✅ 43.906MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ modulo_aspect_for_bytes_bytearray: Time: ✅ 14.544ms (SLO: <19.150ms 📉 -24.1%) vs baseline: -0.2%; Memory: ✅ 44.052MB (SLO: <46.000MB -4.2%) vs baseline: +5.1%
✅ modulo_noaspect: Time: ✅ 0.366ms (SLO: <3.000ms 📉 -87.8%) vs baseline: +1.2%; Memory: ✅ 43.877MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ replace_aspect: Time: ✅ 18.330ms (SLO: <24.000ms 📉 -23.6%) vs baseline: +0.2%; Memory: ✅ 43.940MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ replace_noaspect: Time: ✅ 288.694µs (SLO: <400.000µs 📉 -27.8%) vs baseline: +0.4%; Memory: ✅ 43.943MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ repr_aspect: Time: ✅ 329.026µs (SLO: <420.000µs 📉 -21.7%) vs baseline: +1.0%; Memory: ✅ 43.837MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%
✅ repr_noaspect: Time: ✅ 46.298µs (SLO: <90.000µs 📉 -48.6%) vs baseline: -1.2%; Memory: ✅ 44.006MB (SLO: <46.000MB -4.3%) vs baseline: +5.4%
✅ rstrip_aspect: Time: ✅ 390.488µs (SLO: <500.000µs 📉 -21.9%) vs baseline: +1.3%; Memory: ✅ 43.966MB (SLO: <46.000MB -4.4%) vs baseline: +5.2%
✅ rstrip_noaspect: Time: ✅ 184.620µs (SLO: <300.000µs 📉 -38.5%) vs baseline: -0.9%; Memory: ✅ 43.830MB (SLO: <46.000MB -4.7%) vs baseline: +4.3%
✅ slice_aspect: Time: ✅ 180.609µs (SLO: <300.000µs 📉 -39.8%) vs baseline: -1.2%; Memory: ✅ 43.929MB (SLO: <46.000MB -4.5%) vs baseline: +5.0%
✅ slice_noaspect: Time: ✅ 54.646µs (SLO: <90.000µs 📉 -39.3%) vs baseline: +1.6%; Memory: ✅ 43.916MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%
✅ stringio_aspect: Time: ✅ 4.598ms (SLO: <5.000ms -8.0%) vs baseline: 📈 +18.0%; Memory: ✅ 43.994MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%
✅ stringio_noaspect: Time: ✅ 359.247µs (SLO: <500.000µs 📉 -28.2%) vs baseline: +0.4%; Memory: ✅ 43.924MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ strip_aspect: Time: ✅ 277.018µs (SLO: <350.000µs 📉 -20.9%) vs baseline: +1.4%; Memory: ✅ 43.956MB (SLO: <46.000MB -4.4%) vs baseline: +5.1%
✅ strip_noaspect: Time: ✅ 177.441µs (SLO: <240.000µs 📉 -26.1%) vs baseline: -0.6%; Memory: ✅ 43.920MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%
✅ swapcase_aspect: Time: ✅ 348.183µs (SLO: <500.000µs 📉 -30.4%) vs baseline: +0.8%; Memory: ✅ 43.929MB (SLO: <46.000MB -4.5%) vs baseline: +4.8%
✅ swapcase_noaspect: Time: ✅ 272.516µs (SLO: <400.000µs 📉 -31.9%) vs baseline: -0.7%; Memory: ✅ 44.000MB (SLO: <46.000MB -4.3%) vs baseline: +5.2%
✅ title_aspect: Time: ✅ 333.461µs (SLO: <500.000µs 📉 -33.3%) vs baseline: -1.1%; Memory: ✅ 43.927MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%
✅ title_noaspect: Time: ✅ 269.269µs (SLO: <400.000µs 📉 -32.7%) vs baseline: +3.4%; Memory: ✅ 43.952MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ translate_aspect: Time: ✅ 513.050µs (SLO: <700.000µs 📉 -26.7%) vs baseline: +0.1%; Memory: ✅ 43.900MB (SLO: <46.000MB -4.6%) vs baseline: +4.8%
✅ translate_noaspect: Time: ✅ 431.340µs (SLO: <500.000µs 📉 -13.7%) vs baseline: -0.6%; Memory: ✅ 43.829MB (SLO: <46.000MB -4.7%) vs baseline: +4.9%
✅ upper_aspect: Time: ✅ 311.331µs (SLO: <500.000µs 📉 -37.7%) vs baseline: +0.3%; Memory: ✅ 43.931MB (SLO: <46.000MB -4.5%) vs baseline: +4.7%
✅ upper_noaspect: Time: ✅ 239.055µs (SLO: <400.000µs 📉 -40.2%) vs baseline: +0.7%; Memory: ✅ 43.882MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%

📈 iastaspectsospath - 24/24
✅ ospathbasename_aspect: Time: ✅ 522.536µs (SLO: <700.000µs 📉 -25.4%) vs baseline: 📈 +23.2%; Memory: ✅ 43.855MB (SLO: <46.000MB -4.7%) vs baseline: +4.8%
✅ ospathbasename_noaspect: Time: ✅ 429.429µs (SLO: <700.000µs 📉 -38.7%) vs baseline: +0.9%; Memory: ✅ 43.866MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ ospathjoin_aspect: Time: ✅ 630.061µs (SLO: <700.000µs -10.0%) vs baseline: +0.2%; Memory: ✅ 43.949MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ ospathjoin_noaspect: Time: ✅ 635.077µs (SLO: <700.000µs -9.3%) vs baseline: -0.3%; Memory: ✅ 43.847MB (SLO: <46.000MB -4.7%) vs baseline: +4.7%
✅ ospathnormcase_aspect: Time: ✅ 350.203µs (SLO: <700.000µs 📉 -50.0%) vs baseline: -0.5%; Memory: ✅ 43.880MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ ospathnormcase_noaspect: Time: ✅ 356.197µs (SLO: <700.000µs 📉 -49.1%) vs baseline: -0.6%; Memory: ✅ 43.967MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%
✅ ospathsplit_aspect: Time: ✅ 480.753µs (SLO: <700.000µs 📉 -31.3%) vs baseline: -0.3%; Memory: ✅ 43.859MB (SLO: <46.000MB -4.7%) vs baseline: +4.6%
✅ ospathsplit_noaspect: Time: ✅ 492.526µs (SLO: <700.000µs 📉 -29.6%) vs baseline: +0.1%; Memory: ✅ 43.782MB (SLO: <46.000MB -4.8%) vs baseline: +4.5%
✅ ospathsplitdrive_aspect: Time: ✅ 369.938µs (SLO: <700.000µs 📉 -47.2%) vs baseline: -0.3%; Memory: ✅ 43.899MB (SLO: <46.000MB -4.6%) vs baseline: +4.9%
✅ ospathsplitdrive_noaspect: Time: ✅ 73.230µs (SLO: <700.000µs 📉 -89.5%) vs baseline: +0.3%; Memory: ✅ 43.955MB (SLO: <46.000MB -4.4%) vs baseline: +5.0%
✅ ospathsplitext_aspect: Time: ✅ 458.394µs (SLO: <700.000µs 📉 -34.5%) vs baseline: -0.3%; Memory: ✅ 43.915MB (SLO: <46.000MB -4.5%) vs baseline: +4.9%
✅ ospathsplitext_noaspect: Time: ✅ 464.313µs (SLO: <700.000µs 📉 -33.7%) vs baseline: +0.3%; Memory: ✅ 43.907MB (SLO: <46.000MB -4.5%) vs baseline: +5.1%

🟡 Near SLO Breach (7 suites)

🟡 djangosimple - 28/28
✅ appsec: Time: ✅ 19.631ms (SLO: <22.300ms 📉 -12.0%) vs baseline: ~same; Memory: ✅ 71.612MB (SLO: <73.500MB -2.6%) vs baseline: +5.0%
✅ exception-replay-enabled: Time: ✅ 1.371ms (SLO: <1.450ms -5.5%) vs baseline: ~same; Memory: ✅ 69.837MB (SLO: <71.500MB -2.3%) vs baseline: +4.9%
✅ iast: Time: ✅ 19.644ms (SLO: <22.250ms 📉 -11.7%) vs baseline: -0.8%; Memory: ✅ 71.631MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%
✅ profiler: Time: ✅ 15.201ms (SLO: <16.550ms -8.2%) vs baseline: ~same; Memory: ✅ 60.391MB (SLO: <61.000MB 🟡 -1.0%) vs baseline: +4.9%
✅ resource-renaming: Time: ✅ 19.558ms (SLO: <21.750ms 📉 -10.1%) vs baseline: +0.3%; Memory: ✅ 71.602MB (SLO: <73.500MB -2.6%) vs baseline: +4.9%
✅ span-code-origin: Time: ✅ 19.986ms (SLO: <28.200ms 📉 -29.1%) vs baseline: +0.2%; Memory: ✅ 71.576MB (SLO: <75.000MB -4.6%) vs baseline: +4.8%
✅ tracer: Time: ✅ 19.616ms (SLO: <21.750ms -9.8%) vs baseline: -0.6%; Memory: ✅ 71.329MB (SLO: <75.000MB -4.9%) vs baseline: +4.4%
✅ tracer-and-profiler: Time: ✅ 21.035ms (SLO: <23.500ms 📉 -10.5%) vs baseline: +0.6%; Memory: ✅ 73.482MB (SLO: <75.000MB -2.0%) vs baseline: +4.7%
✅ tracer-dont-create-db-spans: Time: ✅ 19.694ms (SLO: <21.500ms -8.4%) vs baseline: ~same; Memory: ✅ 71.513MB (SLO: <75.000MB -4.6%) vs baseline: +4.8%
✅ tracer-minimal: Time: ✅ 17.793ms (SLO: <18.500ms -3.8%) vs baseline: -1.2%; Memory: ✅ 71.523MB (SLO: <75.000MB -4.6%) vs baseline: +4.9%
✅ tracer-no-caches: Time: ✅ 18.806ms (SLO: <19.650ms -4.3%) vs baseline: ~same; Memory: ✅ 71.457MB (SLO: <75.000MB -4.7%) vs baseline: +4.9%
✅ tracer-no-databases: Time: ✅ 20.550ms (SLO: <21.100ms -2.6%) vs baseline: -0.8%; Memory: ✅ 71.643MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%
✅ tracer-no-middleware: Time: ✅ 19.350ms (SLO: <21.500ms -10.0%) vs baseline: -0.3%; Memory: ✅ 71.596MB (SLO: <75.000MB -4.5%) vs baseline: +5.2%
✅ tracer-no-templates: Time: ✅ 19.548ms (SLO: <22.000ms 📉 -11.1%) vs baseline: +1.0%; Memory: ✅ 71.722MB (SLO: <73.500MB -2.4%) vs baseline: +5.2%

🟡 iastpropagation - 8/8
✅ no-propagation: Time: ✅ 48.726µs (SLO: <60.000µs 📉 -18.8%) vs baseline: -0.5%; Memory: ✅ 40.855MB (SLO: <42.000MB -2.7%) vs baseline: +3.9%
✅ propagation_enabled: Time: ✅ 138.970µs (SLO: <190.000µs 📉 -26.9%) vs baseline: +0.3%; Memory: ✅ 41.229MB (SLO: <42.000MB 🟡 -1.8%) vs baseline: +5.5%
✅ propagation_enabled_100: Time: ✅ 1.572ms (SLO: <2.300ms 📉 -31.7%) vs baseline: +0.4%; Memory: ✅ 41.012MB (SLO: <42.000MB -2.4%) vs baseline: +4.8%
✅ propagation_enabled_1000: Time: ✅ 29.349ms (SLO: <34.550ms 📉 -15.1%) vs baseline: +0.3%; Memory: ✅ 40.835MB (SLO: <42.000MB -2.8%) vs baseline: +4.5%

🟡 otelspan - 22/22
✅ add-event: Time: ✅ 41.627ms (SLO: <47.150ms 📉 -11.7%) vs baseline: ~same; Memory: ✅ 41.472MB (SLO: <47.000MB 📉 -11.8%) vs baseline: +4.3%
✅ add-metrics: Time: ✅ 234.535ms (SLO: <344.800ms 📉 -32.0%) vs baseline: ~same; Memory: ✅ 45.622MB (SLO: <47.500MB -4.0%) vs baseline: +5.1%
✅ add-tags: Time: ✅ 264.274ms (SLO: <330.000ms 📉 -19.9%) vs baseline: -0.5%; Memory: ✅ 45.536MB (SLO: <47.500MB -4.1%) vs baseline: +4.7%
✅ get-context: Time: ✅ 80.953ms (SLO: <92.350ms 📉 -12.3%) vs baseline: ~same; Memory: ✅ 41.288MB (SLO: <46.500MB 📉 -11.2%) vs baseline: +5.3%
✅ is-recording: Time: ✅ 37.977ms (SLO: <44.500ms 📉 -14.7%) vs baseline: +0.4%; Memory: ✅ 41.117MB (SLO: <47.500MB 📉 -13.4%) vs baseline: +5.2%
✅ record-exception: Time: ✅ 62.680ms (SLO: <67.650ms -7.3%) vs baseline: -0.1%; Memory: ✅ 41.855MB (SLO: <47.000MB 📉 -10.9%) vs baseline: +5.1%
✅ set-status: Time: ✅ 43.722ms (SLO: <50.400ms 📉 -13.2%) vs baseline: +0.6%; Memory: ✅ 40.958MB (SLO: <47.000MB 📉 -12.9%) vs baseline: +4.9%
✅ start: Time: ✅ 38.863ms (SLO: <44.500ms 📉 -12.7%) vs baseline: +4.3%; Memory: ✅ 40.987MB (SLO: <47.000MB 📉 -12.8%) vs baseline: +4.9%
✅ start-finish: Time: ✅ 89.959ms (SLO: <92.000ms -2.2%) vs baseline: +0.6%; Memory: ✅ 38.830MB (SLO: <46.500MB 📉 -16.5%) vs baseline: +4.9%
✅ start-finish-telemetry: Time: ✅ 91.263ms (SLO: <93.000ms 🟡 -1.9%) vs baseline: -0.4%; Memory: ✅ 38.692MB (SLO: <46.500MB 📉 -16.8%) vs baseline: +4.7%
✅ update-name: Time: ✅ 39.189ms (SLO: <45.150ms 📉 -13.2%) vs baseline: +0.5%; Memory: ✅ 41.038MB (SLO: <47.000MB 📉 -12.7%) vs baseline: +5.1%

🟡 packagesupdateimporteddependencies - 24/24 (1 unstable)
✅ import_many: Time: ✅ 169.027µs (SLO: <170.000µs 🟡 -0.6%) vs baseline: +0.4%; Memory: ✅ 41.459MB (SLO: <46.000MB -9.9%) vs baseline: +5.2%
✅ import_many_cached: Time: ✅ 131.168µs (SLO: <170.000µs 📉 -22.8%) vs baseline: -1.1%; Memory: ✅ 41.347MB (SLO: <46.000MB 📉 -10.1%) vs baseline: +5.0%
✅ import_many_stdlib: Time: ✅ 1.257ms (SLO: <1.750ms 📉 -28.2%) vs baseline: +0.6%; Memory: ✅ 41.222MB (SLO: <46.000MB 📉 -10.4%) vs baseline: +4.9%
Introduce ``ddtrace/contrib/internal/aws_durable_execution_sdk_python/trace_checkpoint.py``
which appends a synthetic ``_datadog_{N}`` STEP operation to the durable
execution log on every ``SuspendExecution``. The payload is a JSON dict of
the propagation headers for the active trace, with the per-span volatile
fields (``x-datadog-parent-id``, ``traceparent``'s parent segment) rewritten
to point at the durable-execution root span — either the grandparent of the
current ``aws.durable.execute`` span (first invocation) or the parent id
already stored in the latest prior checkpoint (replays).
Diffing the new headers against the stored payload of the highest-N existing
``_datadog_*`` operation suppresses redundant writes; only ``x-datadog-parent-id``
and the ``dd=p:`` entry of ``tracestate`` are stripped before comparison so
sampling priority, decision-maker, origin, and propagation tags still trigger
a fresh save when they change.
Wired into ``_traced_durable_execution`` so a single checkpoint is written per
invocation, only on the suspend path (workflows that return or fail
terminally would never read the checkpoint).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
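The header rewriting described in the commit message above can be sketched roughly as follows. This is an illustrative sketch, not the code in ``trace_checkpoint.py``: the function names ``stabilize_headers`` and ``pick_anchor_parent_id`` are hypothetical, and the sketch assumes the Datadog parent id is carried as a decimal string while the ``traceparent`` parent segment is 16 lowercase hex digits, as in the W3C Trace Context format.

```python
import json


def pick_anchor_parent_id(prior_payload, grandparent_span_id):
    """Choose the span id that checkpoints should point at (hypothetical helper).

    On replays, reuse the parent id stored in the latest prior checkpoint;
    on the first invocation, fall back to the grandparent span id.
    """
    if prior_payload and prior_payload.get("x-datadog-parent-id"):
        return int(prior_payload["x-datadog-parent-id"])
    return grandparent_span_id


def stabilize_headers(headers, anchor_span_id):
    """Return a copy of the propagation headers re-pointed at anchor_span_id."""
    out = dict(headers)  # never mutate the caller's dict
    if "x-datadog-parent-id" in out:
        out["x-datadog-parent-id"] = str(anchor_span_id)
    traceparent = out.get("traceparent")
    if traceparent:
        # W3C layout: version-traceid-parentid-flags
        parts = traceparent.split("-")
        if len(parts) == 4:
            parts[2] = f"{anchor_span_id:016x}"
            out["traceparent"] = "-".join(parts)
    return out


# Usage: build the JSON payload that a checkpoint might store.
headers = {
    "x-datadog-trace-id": "123",
    "x-datadog-parent-id": "999",
    "x-datadog-sampling-priority": "1",
    "traceparent": f"00-{123:032x}-{999:016x}-01",
}
anchor = pick_anchor_parent_id(prior_payload=None, grandparent_span_id=42)
payload = json.dumps(stabilize_headers(headers, anchor), sort_keys=True)
```

Keeping the rewrite in a pure function over a plain dict makes the stabilization step easy to test without a tracer or a durable-execution context.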
mabdinur left a comment
Left some nits; the only blocking comment is about using native threads. From a tracing standpoint the direction looks good. Thanks for driving this.
…ableContext and has .state
…tability issues in forked environments
quinna-h left a comment
Left one comment/concern, overall LGTM
/merge
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
devflow unqueued this merge request: It did not become mergeable within the expected time.
Description
https://datadoghq.atlassian.net/browse/APMSVLS-493
This adds Datadog trace-context checkpointing for AWS Durable Execution workflows so traces can continue across Lambda invocations when a workflow suspends and later resumes.
When a durable handler raises `SuspendExecution`, the integration now appends an async synthetic `_datadog_{N}` STEP containing the current propagation headers. On replay, `datadog-lambda-python` can read the latest checkpoint and reactivate the trace context before the workflow continues. That part is implemented in datadog-lambda-python PR #818.
The checkpoint writer also:
Testing
`tests/contrib/aws_durable_execution_sdk_python/test_trace_checkpoint.py` for header stabilization, parent-id anchoring/reuse, `traceparent` rewriting, replay diff suppression, checkpoint numbering, concurrent allocation, and no-op failure paths.
Risks
Low to medium. This only writes Datadog checkpoint metadata on the `SuspendExecution` path, and failures in the checkpoint writer are swallowed so workflow execution should not be affected.
The main behavioral risks are:
- `_datadog_*` prefix
- `state.operations`, `create_checkpoint`, or operation parent IDs
Note
To avoid creating too many checkpoints, we are excluding the `dd=p:` part when comparing the trace context.
Diffing the new headers against the stored payload of the highest-N existing `_datadog_*` operation suppresses redundant writes; only `x-datadog-parent-id` and the `dd=p:` entry of `tracestate` are stripped before comparison so sampling priority, decision-maker, origin, and propagation tags still trigger a fresh save when they change.
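The diff-suppression rule in the note can be illustrated with a small sketch. Everything here is an assumption made for illustration: operations are modeled as `(name, json_payload)` tuples rather than the SDK's real operation objects, the helper names are hypothetical, and numbering is assumed to start at 0. The sketch strips only `x-datadog-parent-id` and the `p:` entry of the `dd` member of `tracestate` before comparing, so any other header change still produces a new checkpoint name.

```python
import json
import re

# Matches synthetic checkpoint names such as "_datadog_0", "_datadog_12".
_CHECKPOINT_NAME = re.compile(r"^_datadog_(\d+)$")


def latest_checkpoint(operations):
    """Return (n, payload_dict) for the highest-N _datadog_N op, or (None, None)."""
    best_n, best_payload = None, None
    for name, payload in operations:
        match = _CHECKPOINT_NAME.match(name or "")
        if not match:
            continue
        n = int(match.group(1))  # compare numerically so 10 > 9
        if best_n is None or n > best_n:
            best_n, best_payload = n, payload
    if best_n is None:
        return None, None
    return best_n, json.loads(best_payload)


def strip_dd_p(tracestate):
    """Drop the p:<span id> entry from the dd member of a tracestate value."""
    members = []
    for member in tracestate.split(","):
        key, sep, value = member.strip().partition("=")
        if key == "dd":
            value = ";".join(e for e in value.split(";") if not e.startswith("p:"))
        members.append(f"{key}{sep}{value}")
    return ",".join(members)


def comparable(headers):
    """Headers with the per-span volatile fields removed, for equality checks."""
    out = {k: v for k, v in headers.items() if k != "x-datadog-parent-id"}
    if out.get("tracestate"):
        out["tracestate"] = strip_dd_p(out["tracestate"])
    return out


def next_checkpoint_name(new_headers, operations):
    """Return the name to write, or None when the write would be redundant."""
    n, prior = latest_checkpoint(operations)
    if prior is not None and comparable(prior) == comparable(new_headers):
        return None
    return f"_datadog_{0 if n is None else n + 1}"
```

With a stored `_datadog_1` payload, a replay whose headers differ only in the parent id and `dd=p:` yields `None` (no write), while a changed sampling priority yields `_datadog_2`.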