While investigating high CPU usage in a roachtest failure (#153772), we identified a performance bottleneck in errors.Is.
The profile (cpuprof.2025-09-19T09_06_36.440.76.pprof.zip) shows the expensive check, which is reached from a call in the SQL connection handling code.
Of course, most of the time there are no errors, so this code path is only exercised when some other issue is already occurring. But it does seem like this would make any pre-existing issue worse.
Interestingly, this comment in the cockroachdb/errors implementation mentions that this could become a performance bottleneck: https://github.com/cockroachdb/errors/blob/2008f7c3ac42391b26a4bbb1cf0bbf7057605610/markers/mark
I wrote a benchmark in #154222 to measure this function call with different kinds of errors.
goos: darwin
goarch: arm64
cpu: Apple M1 Pro
BenchmarkErrorsIs/SimpleError                    1618170       716.0 ns/op       448 B/op      14 allocs/op
BenchmarkErrorsIs/WrappedError                    300048        4009 ns/op      3344 B/op      67 allocs/op
BenchmarkErrorsIs/WrappedWithStack                941736        1179 ns/op       896 B/op      23 allocs/op
BenchmarkErrorsIs/NetworkError                    414190        2790 ns/op      1768 B/op      52 allocs/op
BenchmarkErrorsIs/DeeplyWrappedNetworkError        44353       27774 ns/op     23419 B/op     381 allocs/op
BenchmarkErrorsIs/MultipleWrappedErrors            65017       18352 ns/op     20259 B/op     275 allocs/op
BenchmarkErrorsIs/NetworkErrorWithLongAddress     111456       10746 ns/op      8145 B/op     150 allocs/op
BenchmarkErrorsIs/WithMessage                     520652        2595 ns/op      1880 B/op      40 allocs/op
BenchmarkErrorsIs/MultipleWithMessage             107244        9438 ns/op      6929 B/op     122 allocs/op
BenchmarkErrorsIs/WithMessageAndStack             215155        5640 ns/op      5296 B/op      87 allocs/op
BenchmarkErrorsIs/NetworkErrorWithMessage         184285        6572 ns/op      4648 B/op      99 allocs/op
BenchmarkErrorsIs/NetworkErrorWithEverything       25274       46921 ns/op     35885 B/op     594 allocs/op
BenchmarkErrorsIs/DeeplyNested100Levels              256     4702923 ns/op   5931739 B/op   62712 allocs/op
Some ideas for improving this are discussed in Slack.
Jira issue: CRDB-54997