mirror of
https://github.com/recp/cglm.git
synced 2025-10-03 16:51:35 +00:00
According to [emscripten](https://emscripten.org/docs/porting/simd.html) and [v8](b6520eda5e/src/compiler/backend/x64/code-generator-x64.cc (L2661-L2699)), `[f32x4|f64x2].[min|max]` compiles to many more instructions than `[f32x4|f64x2].[pmin|pmax]`. The [spec](https://github.com/WebAssembly/spec/blob/main/proposals/simd/SIMD.md#floating-point-min-and-max) defines the difference between pmin/pmax and min/max as NaN-propagating behavior, and in [v8](b6520eda5e/src/compiler/backend/x64/code-generator-x64.cc (L2740-L2747)) the equivalent of the x86 `_mm_min_ps`/`_mm_max_ps` is pmin/pmax. This change should make functions that use min/max faster on WebAssembly and match the existing x86 SSE behavior.
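
To make the NaN difference concrete, here is a minimal scalar sketch of the per-lane semantics, assuming the definitions given in the SIMD proposal (`pmin` as `b < a ? b : a`, `pmax` as `a < b ? b : a`) and the usual description of SSE `minps` (the second operand is returned when the comparison is false). The `model_*` names are hypothetical helpers for illustration only; they are not part of cglm, emscripten, or v8.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar models of one lane; not real cglm or wasm APIs. */

/* wasm pmin: defined in the SIMD proposal as b < a ? b : a */
static inline float model_pmin(float a, float b) { return b < a ? b : a; }

/* wasm pmax: defined in the SIMD proposal as a < b ? b : a */
static inline float model_pmax(float a, float b) { return a < b ? b : a; }

/* SSE minps lane (assumed model): a < b ? a : b, so b is returned on NaN */
static inline float model_sse_min(float a, float b) { return a < b ? a : b; }

/* wasm min: NaN-propagating, and -0 is treated as smaller than +0 */
static inline float model_min(float a, float b) {
  if (isnan(a) || isnan(b)) return NAN;  /* any NaN input yields NaN */
  if (a == b) return signbit(a) ? a : b; /* picks -0 over +0 */
  return a < b ? a : b;
}
```

In this model, `model_pmin(a, b)` matches `model_sse_min(b, a)` for every input, which is why a single compare-and-select instruction can be enough for pmin/pmax, while min/max need extra work to handle NaN and signed zero. With emscripten's `wasm_simd128.h`, the corresponding intrinsics are `wasm_f32x4_pmin`/`wasm_f32x4_pmax` and `wasm_f32x4_min`/`wasm_f32x4_max`.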