Reconcile Sync period modifications & ListVolume enhancements to support 120k CNS Volumes #3909

Open
chethanv28 wants to merge 1 commit into kubernetes-sigs:master from chethanv28:topic/chethanv28/modify-listvolume-sync-duration

@chethanv28
Collaborator

@chethanv28 chethanv28 commented Feb 26, 2026

What this PR does / why we need it:
The csi-attacher sidecar's VolumeAttachment (VA) reconciler runs at a default interval of 1 minute and calls ListVolumes on the CSI driver on every cycle. This filled the CSI controller logs with a ListVolumes entry once per minute per cluster. With this change, ListVolumes is invoked once every 5 minutes.

Additionally, the Supervisor (wcp) ListVolumes implementation accounted for VM Service and VKS VM Service workloads but not for Pod VM workloads. Pod VMs are ephemeral VMs running inside supervisor namespaces, and their disks do not appear in the standard ESXi host → VM hardware scan used to derive published nodes. As a result, PVCs consumed by Pod VMs were silently omitted from ListVolumes responses, which gave the csi-attacher an incomplete view of volume-to-node mappings during its reconciliation cycles.
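
The gap described above can be illustrated with a minimal Go sketch. It assumes that the nodes found by the host → VM hardware scan are merged with nodes taken from VolumeAttachment status. The `attachment` type and the `mergePublishedNodes` function are hypothetical names chosen for illustration, not the identifiers used in this PR or in the driver.

```go
package main

import (
	"fmt"
	"sort"
)

// attachment is a hypothetical, simplified stand-in for a Kubernetes
// VolumeAttachment: which volume is bound to which node, and whether the
// attach has completed.
type attachment struct {
	VolumeID string
	NodeName string
	Attached bool
}

// mergePublishedNodes returns, per volume ID, the union of the nodes found by
// the host -> VM scan and the nodes reported by attached VolumeAttachments.
// Volumes that only show up through an attachment (the Pod VM case) are
// therefore included in the result.
func mergePublishedNodes(scanned map[string][]string, vas []attachment) map[string][]string {
	sets := make(map[string]map[string]struct{})
	add := func(vol, node string) {
		if sets[vol] == nil {
			sets[vol] = make(map[string]struct{})
		}
		sets[vol][node] = struct{}{}
	}
	for vol, nodes := range scanned {
		for _, n := range nodes {
			add(vol, n)
		}
	}
	for _, va := range vas {
		if va.Attached { // ignore attachments that have not completed
			add(va.VolumeID, va.NodeName)
		}
	}
	out := make(map[string][]string, len(sets))
	for vol, set := range sets {
		nodes := make([]string, 0, len(set))
		for n := range set {
			nodes = append(nodes, n)
		}
		sort.Strings(nodes) // deterministic order for callers and tests
		out[vol] = nodes
	}
	return out
}

func main() {
	scanned := map[string][]string{"vol-1": {"vm-a"}}
	vas := []attachment{
		{VolumeID: "vol-2", NodeName: "podvm-x", Attached: true},
		{VolumeID: "vol-3", NodeName: "podvm-y", Attached: false},
	}
	fmt.Println(mergePublishedNodes(scanned, vas))
}
```

In this sketch a volume that is visible only through an attached VolumeAttachment still ends up with a published node, which is the property the csi-attacher needs for a complete view during reconciliation.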

Testing done:
Running e2e pipelines. Results TBA

ListVolume invocation frequency is now reduced.

Before (invoked every minute):

root@420a06edb013734dfdb354729733710b [ ~ ]# kubectl logs vsphere-csi-controller-7b58df5474-nvpq7 -c vsphere-csi-controller -n vmware-system-csi | grep "ListVolumes called"
2026-02-25T22:50:40.827Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "42327418-dfa8-4fde-90b0-7542cfadb527"}
2026-02-25T22:51:41.050Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "0eb5a0d5-db6c-49d0-9806-1e78cb3ca8d7"}
2026-02-25T22:52:41.151Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "ccb6f8ed-0dd7-4620-bd54-c088b61e28e5"}

After (invoked every 5 minutes):

kubectl logs vsphere-csi-controller-674f55d64f-h6j6j -c vsphere-csi-controller -n vmware-system-csi | grep "ListVolumes called"
2026-02-25T22:57:31.207Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "dba69bcc-3f44-4dad-a60a-aeb947b258dc"}
2026-02-25T23:02:31.845Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "54339a3d-8f58-4ede-9956-2bca532905fb"}
2026-02-25T23:07:31.988Z	DEBUG	wcp/controller.go:2351	ListVolumes called with args , expectedStartingIndex 0	{"TraceId": "8d1f49d6-994f-4946-b83a-897e41c08417"}

Special notes for your reviewer:

Release note:

Reconcile Sync period modifications & ListVolume enhancements to support 120k CNS Volumes

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 26, 2026
@chethanv28 chethanv28 added 120k Volumes Support Support up to 120k Volumes and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 26, 2026
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 26, 2026
… API in the wcp controller to look for volumeattachment status for Pod VM PVC & PV
@chethanv28 chethanv28 force-pushed the topic/chethanv28/modify-listvolume-sync-duration branch from 25247c3 to e204a90 Compare February 26, 2026 20:44
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 26, 2026