Skip to content

Fix O(N²) entity addition in CallbackGroup (#2942)#3166

Open
PavelGuzenfeld wants to merge 3 commits into
ros2:rollingfrom
PavelGuzenfeld:fix/callback-group-quadratic-add-2942
Open

Fix O(N²) entity addition in CallbackGroup (#2942)#3166
PavelGuzenfeld wants to merge 3 commits into
ros2:rollingfrom
PavelGuzenfeld:fix/callback-group-quadratic-add-2942

Conversation

@PavelGuzenfeld
Copy link
Copy Markdown

@PavelGuzenfeld PavelGuzenfeld commented Jun 6, 2026

Fixes #2942

The add_* methods (add_timer, add_subscription, add_service, add_client, add_waitable) scanned the whole entity list to drop expired weak_ptrs on every call, so adding N entities was O(N²). In the #2942 reproducer (10,000 timers) that scan is ~96% of the wall time.

This moves the expired-entry cleanup out of the add_* methods — which become a plain push_back — and into collect_all_ptrs, which already walks every entry on each executor spin and so can compact in the same pass at no extra cost (erase/remove_if). collect_all_ptrs is now non-const; there's no public API or ABI change.

It's safe because collect_all_ptrs already skipped expired entries and now also removes them, the executor calls it every spin so cleanup stays prompt, and the find_*_ptrs_if helpers already tolerate expired weak_ptrs.

Reproducer from #2942 (10,000 timers): 429 ms → 6 ms.

Adds test_callback_group_entity_cleanup.cpp covering compaction during collect_all_ptrs, live-only collection, non-quadratic insertion, interleaved add/remove, and mixed entity types.


Some of this change was produced with Claude Opus 4.6 (Anthropic).

Move expired weak_ptr cleanup from add_* methods (O(N) per call, O(N²)
total when adding N entities) into collect_all_ptrs via a new
collect_and_compact helper. collect_all_ptrs already iterates all
entries on every executor spin, so cleanup happens at zero extra cost.

The add_* methods become O(1) (just push_back). This fixes the
quadratic slowdown reported in ros2#2942 (429ms → 6ms for 10,000 timers).

This change is safe because:
- collect_all_ptrs already skipped expired entries; now it also removes
  them in the same pass
- The executor calls collect_all_ptrs on every spin iteration, so
  expired entries are cleaned up promptly
- All find_*_ptrs_if methods already handle expired weak_ptrs gracefully

Fixes ros2#2942

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
Tests verify that:
- Expired weak_ptrs are compacted during collect_all_ptrs
- collect_all_ptrs only yields live entities
- Adding N timers is not quadratic (5000 timers in < 5s)
- Interleaved add/remove cycles produce correct entity counts
- Mixed entity types (timers + subscriptions) are cleaned independently

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
- Remove mutable from entity vectors, make collect_all_ptrs non-const
- Replace collect_and_compact helper with std::remove_if (erase-remove idiom)
- Remove leaked test_time_source_deadlock CMake entry from another branch
- Fix uncrustify formatting in test file

Signed-off-by: Pavel Guzenfeld <pavelguzenfeld@gmail.com>
Generated-by: Claude Opus 4.6 (Anthropic)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Why does adding N entities to a CallbackGroup take N^2, i.e. quadratic time ?

1 participant