Metric

Enum

  • CPU (value: 'cpu')

  • MEMORY (value: 'memory')

  • GPU (value: 'gpu')

  • HTTP_REQUESTS (value: 'http_requests')

  • HTTP_REQUESTS_BY_STATUS (value: 'http_requests_by_status')

  • ERROR_CODE (value: 'error_code')

  • REQUEST_LATENCY_50_PERCENTILE (value: 'request_latency_50_percentile')

  • REQUEST_LATENCY_90_PERCENTILE (value: 'request_latency_90_percentile')

  • REQUEST_LATENCY_99_PERCENTILE (value: 'request_latency_99_percentile')

  • TOKENS_PER_SECOND (value: 'tokens_per_second')

  • TIME_TO_FIRST_TOKEN (value: 'time_to_first_token')

  • PREFIX_CACHE_HIT_RATE (value: 'prefix_cache_hit_rate')
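
As a minimal sketch, the members listed above could be mirrored in client code as a string-backed Python enum. The class body copies the names and values from this page. The `parse_metric` helper is hypothetical and not part of the documented API; it only illustrates validating a raw string against the allowed values.

```python
from enum import Enum


class Metric(str, Enum):
    """Illustrative mirror of the Metric enum documented above."""

    CPU = "cpu"
    MEMORY = "memory"
    GPU = "gpu"
    HTTP_REQUESTS = "http_requests"
    HTTP_REQUESTS_BY_STATUS = "http_requests_by_status"
    ERROR_CODE = "error_code"
    REQUEST_LATENCY_50_PERCENTILE = "request_latency_50_percentile"
    REQUEST_LATENCY_90_PERCENTILE = "request_latency_90_percentile"
    REQUEST_LATENCY_99_PERCENTILE = "request_latency_99_percentile"
    TOKENS_PER_SECOND = "tokens_per_second"
    TIME_TO_FIRST_TOKEN = "time_to_first_token"
    PREFIX_CACHE_HIT_RATE = "prefix_cache_hit_rate"


def parse_metric(raw: str) -> Metric:
    """Hypothetical helper: map a raw string to a Metric member.

    Raises ValueError listing the allowed values when the string is unknown.
    """
    try:
        return Metric(raw)
    except ValueError:
        allowed = ", ".join(m.value for m in Metric)
        raise ValueError(
            f"unknown metric {raw!r}; expected one of: {allowed}"
        ) from None


if __name__ == "__main__":
    metric = parse_metric("time_to_first_token")
    print(metric.name, metric.value)
```

Because the sketch subclasses `str`, each member compares equal to its value string, which is convenient when the value is sent as a query parameter or JSON field.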