allow external hpa by robinvd · Pull Request #520 · kserve/modelmesh-serving

robinvd · 2024-07-16T11:04:14Z

Motivation

We need automatic scaling for ServingRuntimes based on a GPU load metric. The default deployment already includes metrics support, including Prometheus. We want to use KEDA to scale based on Prometheus queries.

The current options for scaling are:

Static, based on the replicas in the ServingRuntime/config.
Dynamic, using the HPA annotation. However, this creates a managed HPA that only works with the built-in Kubernetes metrics, and adding another HPA would conflict with the one already created.

This PR adds the external option, where the controller won't set the replicas and won't create an HPA.

Modifications

The controller now handles the already existing External AutoscalerClass instead of crashing on it. The behavior is the same as None, except that it does not set the replicas property.
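As a rough illustration of the branching described above, here is a minimal sketch. It is not the controller's actual code: the `AutoscalerClass` type, its constant values, `ScalingPlan`, and `PlanScaling` are hypothetical names invented for this example, and the choice to leave replicas unset in the HPA case is an assumption.

```go
package main

import "fmt"

// AutoscalerClass is a hypothetical stand-in for the controller's autoscaler class type.
type AutoscalerClass string

const (
	AutoscalerClassNone     AutoscalerClass = "none"     // assumed value
	AutoscalerClassHPA      AutoscalerClass = "hpa"      // assumed value
	AutoscalerClassExternal AutoscalerClass = "external" // assumed value
)

// ScalingPlan describes what the controller would do for a runtime Deployment.
type ScalingPlan struct {
	Replicas   *int32 // nil means: leave the replicas property unset
	ManagedHPA bool   // true means: the controller creates and owns an HPA
}

// PlanScaling maps an autoscaler class to a scaling plan.
func PlanScaling(class AutoscalerClass, configured int32) (ScalingPlan, error) {
	switch class {
	case AutoscalerClassNone:
		// Static scaling: pin replicas to the configured value.
		r := configured
		return ScalingPlan{Replicas: &r}, nil
	case AutoscalerClassHPA:
		// Managed HPA; leaving replicas unset here is an assumption of this sketch.
		return ScalingPlan{ManagedHPA: true}, nil
	case AutoscalerClassExternal:
		// Same as None, but replicas stay unset and no HPA is created,
		// so an external scaler (for example KEDA) can own the replica count.
		return ScalingPlan{}, nil
	default:
		// Return an error rather than crashing on an unknown class.
		return ScalingPlan{}, fmt.Errorf("unknown autoscaler class %q", class)
	}
}

func main() {
	plan, err := PlanScaling(AutoscalerClassExternal, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("replicas set:", plan.Replicas != nil, "managed HPA:", plan.ManagedHPA)
}
```

With the external class, nothing in the plan touches the replica count, which is what lets a separately deployed scaler manage it without conflicts.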

Result

Potentially (or at least partially) solves #372.

oss-prow-bot · 2024-07-16T11:04:18Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: robinvd
Once this PR has been reviewed and has the lgtm label, please assign rafvasq for approval by writing /assign @rafvasq in a comment. For more information see: The Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

vinismarques · 2025-03-05T15:03:14Z

Hey @robinvd, have you been using HPA based on GPU load successfully? I would love to see this being integrated.

robinvd · 2025-03-10T20:16:46Z

@vinismarques Yes, with this change I'm running an autoscaling setup based on GPU load.

oss-prow-bot bot requested review from ckadner, njhill and rafvasq July 16, 2024 11:04

oss-prow-bot bot requested a review from rafvasq July 16, 2024 11:04

robinvd force-pushed the allow-external-hpa branch from f38c91d to 2ea3c3c Compare July 16, 2024 12:07

allow external hpa

8e46c1c

Signed-off-by: Robin <robinvandijk@klippa.com>

robinvd force-pushed the allow-external-hpa branch from 2ea3c3c to 8e46c1c Compare July 16, 2024 12:08

vinismarques mentioned this pull request Mar 5, 2025

ServingRuntime autoscaling monitoring GPU utilization #372

Open

robinvd wants to merge 1 commit into kserve:main from robinvd:allow-external-hpa