Only increment stats when the worker acknowledged the test #373

Merged
kangze-jia merged 3 commits into main from cbruckmayer/only-increment-on-ack-v2 on Feb 26, 2026

Conversation

@ChrisBr
Contributor

@ChrisBr ChrisBr commented Feb 8, 2026

Only increment error stats when the worker has acknowledged the test; otherwise we end up with an incorrect counter.

@ChrisBr ChrisBr force-pushed the cbruckmayer/only-increment-on-ack-v2 branch from a9c8024 to 14cf9e8 February 9, 2026 10:37
@kangze-jia kangze-jia force-pushed the cbruckmayer/only-increment-on-ack-v2 branch 2 times, most recently from b5df285 to b1ea42b February 14, 2026 03:31
ChrisBr and others added 3 commits February 22, 2026 10:52
…place

- Record stats only when worker acknowledges; duplicate acks do not increment
- Redis: record_stats_delta (HINCRBY); record_success returns true when ack'd or replaced
- Stat correction when success replaces failure; real assertion count (test.assertions) in delta
- Test helper: Requeue before Skip when both set; test_aggregation and integration expectations updated
- Remove [stats] debug logging from Redis BuildRecord; test_redis_reporter assertions = 8
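
The behavior described in the commit message above can be illustrated with a minimal in-memory sketch. This is not the ci-queue implementation: the class `StatsSketch`, its `acknowledge` and `apply_delta` helpers, and the field names `"failures"` and `"assertions"` are hypothetical, and a plain Ruby `Hash` stands in for the Redis hash that would be updated with HINCRBY. It also assumes that a success replacing a failure only corrects the failure counter.

```ruby
require "set"

# Hypothetical sketch: counters move only on the first acknowledgement of a test.
class StatsSketch
  attr_reader :stats

  def initialize
    @stats = Hash.new(0) # stands in for a Redis hash incremented via HINCRBY
    @acked = Set.new     # test ids that have already been acknowledged
    @failed = Set.new    # test ids currently recorded as failures
  end

  # Returns true only when this call was the first acknowledgement.
  def record_failure(test_id, assertions:)
    return false unless acknowledge(test_id)

    @failed << test_id
    apply_delta({ "failures" => 1, "assertions" => assertions })
    true
  end

  # Returns true when the test was acknowledged or when it replaced a failure.
  def record_success(test_id, assertions:)
    if acknowledge(test_id)
      apply_delta({ "assertions" => assertions })
      true
    elsif @failed.delete?(test_id)
      # Stat correction: a later success replaces the earlier failure.
      # Assumption: only the failure counter is corrected here.
      apply_delta({ "failures" => -1 })
      true
    else
      false # duplicate ack, nothing to record
    end
  end

  private

  # Set#add? returns nil when the id is already present (a duplicate ack).
  def acknowledge(test_id)
    !@acked.add?(test_id).nil?
  end

  # One signed increment per field, in the spirit of HINCRBY.
  def apply_delta(delta)
    delta.each { |field, by| @stats[field] += by }
  end
end

sketch = StatsSketch.new
sketch.record_failure("test_a", assertions: 3) # first ack: counted
sketch.record_failure("test_a", assertions: 3) # duplicate ack: ignored
sketch.record_success("test_a", assertions: 3) # replaces the failure
sketch.record_success("test_b", assertions: 2) # first ack: counted
# sketch.stats now has "failures" at 0 and "assertions" at 5
```

Applying signed deltas rather than overwriting totals means a correction is just another increment, which is why a single HINCRBY-style primitive is enough for both counting and correcting in this sketch.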
@thadcraft-shopify

@kangze-jia How do we test this in an actual pipeline run?

@thadcraft-shopify thadcraft-shopify self-requested a review February 25, 2026 02:19
@kangze-jia
Contributor

Good question.

Here are my thoughts:

  1. Since we’re not changing the error-reporting path (the “FAILED TESTS SUMMARY:” section), we can use it as a baseline to validate the log stats against the error report output. Ideally we should add an alarm, but we can start with a manual check first.
  2. For each agent, check (by manual search) whether it had any test failures. If it did, check whether the failures were successfully retried: if the retry succeeded, the failure count in the log stats should be zero; if the retry still failed, the log stats should reflect that failure.

This will require some manual sampling across a few Buildkite builds though.
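
The manual check described above could be captured in a small script once the samples have been collected by hand. The sketch below is only an illustration under stated assumptions: `AgentSample`, `expected_failures`, `mismatched_agents`, and the agent and test names are all hypothetical, and the data is made up for the example rather than taken from a real build.

```ruby
# Hypothetical record of what a manual search found for one agent:
# final_outcomes maps each test that failed at least once to its outcome
# after retries; logged_failures is the failure count shown in the log stats.
AgentSample = Struct.new(:agent, :final_outcomes, :logged_failures, keyword_init: true)

# Only tests that still failed after retrying should be counted.
def expected_failures(final_outcomes)
  final_outcomes.count { |_test, outcome| outcome == :failed }
end

# Names of agents whose log stats disagree with the manually derived count.
def mismatched_agents(samples)
  samples
    .reject { |sample| expected_failures(sample.final_outcomes) == sample.logged_failures }
    .map(&:agent)
end

# Illustrative data only.
samples = [
  AgentSample.new(agent: "agent-1", final_outcomes: { "test_a" => :passed }, logged_failures: 0),
  AgentSample.new(agent: "agent-2", final_outcomes: { "test_b" => :failed }, logged_failures: 0),
  AgentSample.new(agent: "agent-3", final_outcomes: { "test_c" => :failed, "test_d" => :passed }, logged_failures: 1)
]

suspect = mismatched_agents(samples)
# suspect contains only "agent-2": its retry still failed but the stats show zero
```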

@thadcraft-shopify

I am trying to understand how we get these code changes into a test pipeline before we merge this.

@kangze-jia
Contributor

kangze-jia commented Feb 25, 2026

Got it.

I created a branch (trigger-ci-status-test) that points at my personal ci-queue branch by updating the Gemfile.lock file (https://app.graphite.com/github/pr/shop/world/419165/Add-no-op-comment-to-trigger-CI-selective-tests%3B-include-Gemfile-changes), and scheduled a job to run that branch: https://buildkite.com/shopify/world-shopify-selective-tests/builds?branch=trigger-ci-status-test&page=7

I checked the log stats, which look good to me.

@kangze-jia kangze-jia force-pushed the cbruckmayer/only-increment-on-ack-v2 branch from b1ea42b to 158f5a4 February 26, 2026 20:01
@kangze-jia kangze-jia merged commit f350805 into main Feb 26, 2026
14 of 24 checks passed
3 participants