feat: Add priority field to Flavor #469
Open
Labels: feature, needs-priority, needs-triage
What would you like to be added:
```yaml
apiVersion: llmaz.io/v1alpha1
kind: OpenModel
metadata:
  name: opt-125m
spec:
  familyName: opt
  source:
    modelHub:
      modelID: facebook/opt-125m
  inferenceConfig:
    flavors:
      - name: h800
        priority: 5 # higher priority
        nodeSelector:
          karpenter.k8s.aws/instance-gpu-name: h800
        limits:
          nvidia.com/gpu: 4
      - name: h100
        priority: 4
        nodeSelector:
          karpenter.k8s.aws/instance-gpu-name: h100
        limits:
          nvidia.com/gpu: 4
      - name: a100
        priority: 3
        nodeSelector:
          karpenter.k8s.aws/instance-gpu-name: a100
        limits:
          nvidia.com/gpu: 4
      - name: a20
        priority: 2
        nodeSelector:
          karpenter.k8s.aws/instance-gpu-name: a20
        limits:
          nvidia.com/gpu: 4
      - name: t4
        priority: 1 # lower priority
        nodeSelector:
          karpenter.k8s.aws/instance-gpu-name: t4
        limits:
          nvidia.com/gpu: 4
```

Why is this needed:
When multiple flavors are defined for a model, there is currently no explicit way to control the order in which they are matched during scheduling. The scheduler tries flavors in the order they appear in the list, which may not reflect the intended preference.
https://github.com/InftyAI/scheduler-plugins/blob/685a4d9f8a769f7f5634a6680f374a05c72823cd/pkg/plugins/resource_fungibility/resource_fungibility.go#L228-L248
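To illustrate the intended behavior, here is a minimal sketch of ordering flavors by priority before matching. It is not the plugin's actual code: the `Flavor` struct is a trimmed-down, hypothetical stand-in for the real API type, `sortByPriority` is a made-up helper name, and the tie-breaking rule (equal priorities keep their list order) is an assumption rather than something decided in this issue.

```go
package main

import (
	"fmt"
	"sort"
)

// Flavor is a hypothetical, simplified stand-in for a model flavor.
// Only the fields relevant to ordering are included.
type Flavor struct {
	Name     string
	Priority int32 // assumed semantics: larger value is tried first
}

// sortByPriority returns a copy of flavors ordered from highest to lowest
// priority. The sort is stable, so flavors with equal priority keep their
// original list order (an assumed tie-breaking rule).
func sortByPriority(flavors []Flavor) []Flavor {
	sorted := make([]Flavor, len(flavors))
	copy(sorted, flavors)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority > sorted[j].Priority
	})
	return sorted
}

func main() {
	// Deliberately listed out of priority order.
	flavors := []Flavor{
		{Name: "t4", Priority: 1},
		{Name: "h800", Priority: 5},
		{Name: "a100", Priority: 3},
	}
	for _, f := range sortByPriority(flavors) {
		fmt.Println(f.Name, f.Priority)
	}
}
```

Sorting a copy leaves the user-defined list untouched, and a stable sort means that flavors without distinct priorities would fall back to today's list-order behavior.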
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.