nvptx: Incorrect use of LLVM intrinsics for f16x2_min/max(_nan) #2056

@RalfJung

The nvptx intrinsics f16x2_min/f16x2_max/f16x2_min_nan/f16x2_max_nan are currently mapped to the LLVM intrinsics minnum/maxnum/minimum/maximum, respectively. In some cases the mapping goes through simd_fmin/simd_fmax, which are documented to correspond to minnum nsz/maxnum nsz, but we currently don't actually emit the nsz flag. See the LLVM LangRef for an overview of the LLVM float min/max operations.

This is incorrect:

  • According to the docs, the behavior for signed zeros is defined by (a < b) ? a : b, i.e., when both operands compare equal, the 2nd operand is returned. None of the LLVM intrinsics behave that way: they either treat -0.0 as smaller than +0.0 (the default), or return either value non-deterministically (when the nsz flag is present). [This means it is actually a bug that LLVM uses the min.f16x2 nvptx operation for lowering minnum...]
  • According to the docs, assuming that isNaN checks for both QNaN and SNaN, if exactly one input is a NaN of either kind, f16x2_min/f16x2_max return the other input. In contrast, minnum/maxnum say that when an input is SNaN, the return value is a NaN or the other input. The LLVM variant with the correct NaN semantics is minimumnum/maximumnum.
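To make the signed-zero mismatch concrete, here is a minimal scalar sketch in Rust. It uses f32 rather than packed f16, and both function names are hypothetical models written for illustration, not stdarch or LLVM APIs. `documented_min` follows the `(a < b) ? a : b` rule and the "exactly one NaN returns the other input" rule quoted above; returning NaN when both inputs are NaN is an assumption. `zero_ordered_min` models a min that treats -0.0 as smaller than +0.0.

```rust
/// Hypothetical scalar model of the documented `min` behavior described above.
fn documented_min(a: f32, b: f32) -> f32 {
    // Assumption: two NaN inputs produce NaN.
    if a.is_nan() && b.is_nan() {
        return f32::NAN;
    }
    // Exactly one NaN input: return the other input.
    if a.is_nan() {
        return b;
    }
    if b.is_nan() {
        return a;
    }
    // When the operands compare equal (including -0.0 vs +0.0),
    // the second operand is returned.
    if a < b { a } else { b }
}

/// Hypothetical model of a min that orders -0.0 below +0.0.
fn zero_ordered_min(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        return b;
    }
    if b.is_nan() {
        return a;
    }
    if a == b {
        // Equal operands: prefer the one with the negative sign bit, if any.
        if a.is_sign_negative() { a } else { b }
    } else if a < b {
        a
    } else {
        b
    }
}
```

With these models, `documented_min(-0.0, 0.0)` yields +0.0 because the second operand wins on equality, while `zero_ordered_min(-0.0, 0.0)` yields -0.0. The two only agree on signed zeros when the negative zero happens to be the second operand.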

Cc @kjetilkjeka @folkertdev
