[WebGPU] Fix nearest half-tie handling for round_prefer_ceil/floor #28757

Open
haoxli wants to merge 3 commits into microsoft:main from haoxli:fix-nearest-half-tie
haoxli commented Jun 3, 2026

Description

ResizeOpTest.ResizeOpNearestUpSample_RoundPreferCeil_HalfPixel_2x2to7x8 fails on Intel WebGPU devices:

error: The difference between cur_expected[i] and cur_actual[i] is 2, which exceeds tolerance, where
cur_expected[i] evaluates to 3,
cur_actual[i] evaluates to 1, and
tolerance evaluates to 0.00030999997397884727.

Motivation and Context

The WebGPU shader detected half-ties with exact equality checks combined with i32 truncation arithmetic. This is fragile under GPU floating-point precision, where a coordinate that should be exactly halfway can land slightly off, and it misbehaves on negative halfway coordinates because truncation rounds toward zero.
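
A minimal sketch of the failure mode described above, written in C++ rather than WGSL. `RoundPreferCeilExact` and `JustBelowHalf` are hypothetical stand-ins, not the shader's actual code; they only assume what the description states, namely a bit-exact tie test built on a truncating integer cast.

```cpp
#include <cmath>

// Hypothetical stand-in for the old logic (NOT the actual shader code):
// an exact-equality tie test built on a truncating cast.
int RoundPreferCeilExact(float x) {
  const int t = static_cast<int>(x);  // truncates toward zero, like an i32 cast
  if (x == static_cast<float>(t) + 0.5f) {
    return t + 1;  // taken only when x is bit-exactly t + 0.5
  }
  // Everything else, including missed ties, rounds half away from zero.
  return static_cast<int>(std::round(x));
}

// One float step below 0.5, standing in for a coordinate that picked up
// rounding error on the GPU.
float JustBelowHalf() { return std::nextafter(0.5f, 0.0f); }
```

Under these assumptions, `RoundPreferCeilExact(0.5f)` returns 1 but `RoundPreferCeilExact(JustBelowHalf())` returns 0, so a one-ULP error selects a different input pixel. For the negative tie -1.5f, truncation gives t = -1, the equality test misses, and the result is -2 where round_prefer_ceil should give -1.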

This change replaces the exact comparisons with an epsilon-based tie check on the fractional part (<= 1e-6) in both the ROUND_PREFER_CEIL and ROUND_PREFER_FLOOR paths.
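
The idea can be sketched in C++ as follows. This is an illustration under assumptions, not the code in resize_impl.cc: the helper names are hypothetical, and reading "<= 1e-6" as "the fractional part is within 1e-6 of 0.5" is an interpretation of the description.

```cpp
#include <cmath>

// Assumed tolerance, taken from the "<= 1e-6" in the description.
constexpr float kTieEpsilon = 1e-6f;

// Hypothetical helper: nearest-integer rounding where near-ties go up
// (prefer_ceil = true) or down (prefer_ceil = false).
int RoundNearestWithTie(float x, bool prefer_ceil) {
  const float fl = std::floor(x);  // floor, not truncation, so negatives work
  const float frac = x - fl;       // fractional part in [0, 1]
  const int base = static_cast<int>(fl);
  if (std::fabs(frac - 0.5f) <= kTieEpsilon) {
    return prefer_ceil ? base + 1 : base;  // treated as a half-tie
  }
  return frac > 0.5f ? base + 1 : base;    // ordinary nearest
}

int RoundPreferCeil(float x) { return RoundNearestWithTie(x, true); }
int RoundPreferFloor(float x) { return RoundNearestWithTie(x, false); }
```

With this sketch, a coordinate one float step below 0.5 still rounds to 1 under `RoundPreferCeil`, and -1.5f rounds to -1 under `RoundPreferCeil` and to -2 under `RoundPreferFloor`. The trade-off is that coordinates within the tolerance of a true tie are deliberately treated as ties.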

It also adds regression coverage for round_prefer_floor in resize_op_test.cc:

  • ResizeOpNearestUpSample_RoundPreferFloor_HalfPixel_2x2to7x8
  • ResizeOpNearestUpSample_RoundPreferFloor_HalfPixel_GH28291_Regression
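
To see why a 2x2 to 7x8 resize exercises the tie path, the sketch below applies the half_pixel transform from the ONNX Resize specification, x_original = (x_resized + 0.5) / scale - 0.5. The helper names are hypothetical, and deriving the scale as out_size / in_size is an assumption; the actual test inputs are not shown here.

```cpp
#include <cmath>
#include <vector>

// half_pixel coordinate transform from the ONNX Resize specification.
float HalfPixelOrig(int out_idx, float scale) {
  return (static_cast<float>(out_idx) + 0.5f) / scale - 0.5f;
}

// Hypothetical helper: output indices whose source coordinate is a half-tie.
// Assumes scale = out_size / in_size.
std::vector<int> HalfTieIndices(int in_size, int out_size) {
  const float scale =
      static_cast<float>(out_size) / static_cast<float>(in_size);
  std::vector<int> ties;
  for (int i = 0; i < out_size; ++i) {
    const float x = HalfPixelOrig(i, scale);
    const float frac = x - std::floor(x);
    if (std::fabs(frac - 0.5f) <= 1e-6f) {
      ties.push_back(i);
    }
  }
  return ties;
}
```

For the 2 to 7 axis the scale is 3.5, and output index 3 maps to (3 + 0.5) / 3.5 - 0.5 = 0.5, a half-tie between input indices 0 and 1. The 2 to 8 axis (scale 4) produces no ties, so under these assumptions only one output row depends on the tie rule.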

xadupre previously approved these changes Jun 3, 2026
Comment thread onnxruntime/core/providers/webgpu/tensor/resize_impl.cc Outdated
haoxli commented Jun 4, 2026

The CI failure is due to an out-of-memory (OOM) error under AddressSanitizer:

1: AddressSanitizer: Out of memory. The process has exhausted 8192MB for size class 8192.
1: =================================================================
1: ==18400==ERROR: AddressSanitizer: allocator is out of memory trying to allocate 0x2000 bytes
1: AddressSanitizer: nested bug in the same thread, aborting.

3/4 Test #1: onnxruntime_test_all .............***Failed  1307.52 sec
Debug (cpuinfo): HTT: APIC ID = 00000018, cores per processor = 32
Debug (cpuinfo): raw CPUID brand string: " AMD EPYC 9V74 80-Core Processor                "
Debug (cpuinfo): detected 1 processor groups
Debug (cpuinfo): detected 32 processors in group 0
Debug (cpuinfo): detected 0 processors before group 0
Debug (cpuinfo): reconstructed APIC ID 0x00000000 for processor 0 in group 0
