HDDS-14730. Update Recon container sync to use container IDs#9842
HDDS-14730. Update Recon container sync to use container IDs#9842jasonosullivan34 wants to merge 4 commits intoapache:masterfrom
Conversation
devmadhuu
left a comment
There was a problem hiding this comment.
Thanks @jasonosullivan34 for the patch. Few comments in line with code. Pls check.
Also please add in PR description , how the patch was tested. Also better write following tests:
-
A unit test for ContainerStateMap.getContainerIDs(state, start, count) verifying pagination and state filtering
-
A unit or integration test for syncWithSCMContainerInfo() covering the "container missing from Recon, add it" path, and the "container already present, skip it" path
| * @return the list of containers from SCM in a given state | ||
| * @throws IOException | ||
| */ | ||
| List<ContainerInfo> getListOfContainers(long startContainerID, |
There was a problem hiding this comment.
This method is unused now. We can remove it if no longer used.
| } | ||
|
|
||
| @Override | ||
| public List<ContainerInfo> getListOfContainers( |
There was a problem hiding this comment.
This method is unused now. We can remove it if no longer used.
| @Override | ||
| public List<ContainerID> getListOfContainerIDs( | ||
| ContainerID startContainerID, int count, HddsProtos.LifeCycleState state) | ||
| throws IOException { |
There was a problem hiding this comment.
Kindly check if this method needs to throw IOException. Looks like method throws Exception.
| ContainerID startContainerID, int count, HddsProtos.LifeCycleState state) | ||
| throws IOException; | ||
|
|
||
| List<ContainerInfo> getListOfContainers( |
|
|
||
| public GetContainerCountResponseProto getClosedContainerCount( | ||
| StorageContainerLocationProtocolProtos.GetContainerCountRequestProto | ||
| GetContainerCountRequestProto |
There was a problem hiding this comment.
Some of the changes in this class are showing due to formatting. Please revert them as not related to PR change.
| if (isSyncDataFromSCMRunning.compareAndSet(false, true)) { | ||
| try { | ||
| List<ContainerInfo> containers = containerManager.getContainers(); | ||
| List<ContainerID> containerIDs = containerManager.getContainerIDs(); |
There was a problem hiding this comment.
We need not to load this list in memory, containerManager.getContainerIDs() loads all container IDs from Recon's local ContainerStateManager into a List upfront. Then for every SCM-returned ID, List.contains() does an O(n) linear scan. On a cluster with a million containers, this is O(n²) work for a full sync.
The ContainerManager interface already exposes containerExist(ContainerID id), which does a direct O(log n) map lookup without materializing any list.
| listOfContainers.forEach(containerID -> { | ||
| boolean isContainerPresentAtRecon = | ||
| containers.contains(containerInfo); | ||
| containerIDs.contains(containerID); |
There was a problem hiding this comment.
| containerIDs.contains(containerID); | |
| boolean isContainerPresentAtRecon = containerManager.containerExist(containerID); |
There was a problem hiding this comment.
This can be optimized above way.
| SCMListContainerIDsResponseProto.newBuilder(); | ||
|
|
||
| List<ContainerID> containerIDs = impl.getListOfContainerIDs( | ||
| startContainerID, request.getCount(), request.getState()); |
There was a problem hiding this comment.
state is declared optional in the proto. In proto2 Java bindings, calling getState() on an unset optional enum field returns the first enum value (OPEN), not null. So should use below:
if (request.hasState()) {
state = request.getState();
}
Without this guard, a caller that omits the state (e.g., future tooling, a different client version) will silently receive only OPEN containers instead of all containers, which would be a silent correctness bug.
| try { | ||
| List<ContainerID> results = scm.getContainerManager().getContainerIDs( | ||
| startContainerID, count, state); | ||
| AUDIT.logReadSuccess(buildAuditMessageForSuccess( |
There was a problem hiding this comment.
Using same Audit actions. SCMAction.LIST_CONTAINER. Two semantically different operations in audit logs, making it harder to distinguish ID-only queries from full container data queries.
|
|
||
| /** | ||
| * Returns container IDs under certain conditions. | ||
| * Search container IDs from start ID(exclusive), |
There was a problem hiding this comment.
ContainerStateMap.getContainerIDs() uses tailMap(start) which is inclusive. Pls correct the javadoc.
What changes were proposed in this pull request?
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14730
How was this patch tested?
(Please explain how this patch was tested. Ex: unit tests, manual tests, workflow run on the fork git repo.)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this.)
@devmadhuu