[ Server ] Metrics

benlee73 2021. 6. 24. 11:22

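Triton provides Prometheus metrics that report GPU and request statistics.

By default, the metrics are available at http://localhost:8002/metrics.

The metrics are accessible only through this endpoint and are never sent to any remote server.

Because the metrics are served as plain text, you can also view them directly with the command below.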

$ curl localhost:8002/metrics
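
The same plain-text output can also be read from a script. The following is a minimal Python sketch, not part of Triton itself: `fetch_metrics_text` and `parse_metrics` are hypothetical helper names, and the parser is deliberately simplified (it assumes label values contain no commas, spaces, or escaped quotes, and that lines carry no trailing timestamp).

```python
import urllib.request


def fetch_metrics_text(url="http://localhost:8002/metrics", timeout=5.0):
    """Download the raw plain-text metrics page from the endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text lines into (name, labels, value) tuples.

    Simplified on purpose: assumes label values contain no commas,
    spaces, or escaped quotes, and that no timestamp follows the value.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        head, _, value = line.rpartition(" ")
        name, labels = head, {}
        if "{" in head:
            name, _, rest = head.partition("{")
            for pair in rest.rstrip("}").split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(value)))
    return samples
```

With a server running on the default port, `parse_metrics(fetch_metrics_text())` would return one tuple per sample line. For anything beyond a quick look, a real Prometheus client or scraper is the more robust choice.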

Metrics can be disabled entirely by starting the server with tritonserver --allow-metrics=false.

The --allow-gpu-metrics=false option disables only the GPU metrics.

The --metrics-port option selects a different port for the metrics endpoint.
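
To show how these options combine, here is a small Python sketch that assembles a `tritonserver` argument list. The helper name `build_tritonserver_command` and the `/models` path in the usage note are assumptions made for illustration; only the three metrics flags come from the text above, plus the standard `--model-repository` flag.

```python
def build_tritonserver_command(model_repository, allow_metrics=True,
                               allow_gpu_metrics=True, metrics_port=None):
    """Assemble a tritonserver argument list using the metrics flags above.

    Hypothetical helper: it only builds the list and does not start the server.
    """
    cmd = ["tritonserver", f"--model-repository={model_repository}"]
    if not allow_metrics:
        # With metrics fully disabled, the other metrics flags are not needed.
        cmd.append("--allow-metrics=false")
        return cmd
    if not allow_gpu_metrics:
        cmd.append("--allow-gpu-metrics=false")
    if metrics_port is not None:
        cmd.append(f"--metrics-port={metrics_port}")
    return cmd
```

For example, `build_tritonserver_command("/models", allow_gpu_metrics=False, metrics_port=9002)` yields a list that can be passed to `subprocess.run`, in which case the metrics would be served on port 9002 instead of 8002.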

The table below summarizes the metrics.

Category	Metric	Description	Granularity	Frequency
GPU Utilization	Power Usage	GPU instantaneous power	Per GPU	Per second
	Power Limit	Maximum GPU power limit	Per GPU	Per second
	Energy Consumption	GPU energy consumption in joules since Triton started	Per GPU	Per second
	GPU Utilization	GPU utilization rate (0.0 - 1.0)	Per GPU	Per second
GPU Memory	GPU Total Memory	Total GPU memory, in bytes	Per GPU	Per second
	GPU Used Memory	Used GPU memory, in bytes	Per GPU	Per second
Count	Request Count	Number of inference requests	Per model	Per request
	Execution Count	Number of inference executions (request count / execution count = average dynamic batch size)	Per model	Per request
	Inference Count	Number of inferences performed (one request counts as "batch size" inferences)	Per model	Per request
Latency	Request Time	Cumulative end-to-end inference request handling time	Per model	Per request
	Queue Time	Cumulative time requests spend waiting in the scheduling queue	Per model	Per request
	Compute Input Time	Cumulative time requests spend processing inference inputs (in the framework backend)	Per model	Per request
	Compute Time	Cumulative time requests spend executing the inference model (in the framework backend)	Per model	Per request
	Compute Output Time	Cumulative time requests spend processing inference outputs (in the framework backend)	Per model	Per request
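
Because the count and latency metrics are cumulative, averages for a time window come from the difference between two scrapes. The sketch below illustrates that arithmetic under stated assumptions: the dictionary keys are hypothetical labels for the table rows (not Triton's actual metric names), and all time values are assumed to share whatever unit the endpoint reports.

```python
def summarize_interval(before, after):
    """Derive per-interval averages from two snapshots of cumulative counters.

    Keys are hypothetical names for the table rows above, not Triton's
    real metric names. Both snapshots must contain the same keys.
    """
    delta = {key: after[key] - before[key] for key in after}
    requests = delta["request_count"]
    executions = delta["execution_count"]
    summary = {}
    if requests > 0:
        # Cumulative time divided by request count gives the mean per request.
        for key in ("request_time", "queue_time", "compute_time"):
            summary["avg_" + key] = delta[key] / requests
    if executions > 0:
        # Requests merged into each execution by the dynamic batcher.
        summary["avg_requests_per_execution"] = requests / executions
        # Individual inferences carried by each execution.
        summary["avg_inferences_per_execution"] = (
            delta["inference_count"] / executions
        )
    return summary
```

Using deltas rather than raw totals keeps the averages tied to the chosen window, so a long-running server's early history does not dominate the result.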
