From 745021ecabfa1f6cbf8dc475326cc1cfcd5d825e Mon Sep 17 00:00:00 2001
From: SangHyeon Park <39648636+shyeonn@users.noreply.github.com>
Date: Sun, 14 Apr 2024 18:57:00 +0900
Subject: [PATCH 1/2] Fix hps docs typo

---
 docs/source/hierarchical_parameter_server/profiling_hps.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/hierarchical_parameter_server/profiling_hps.md b/docs/source/hierarchical_parameter_server/profiling_hps.md
index 9c80c42474..5802613c9b 100644
--- a/docs/source/hierarchical_parameter_server/profiling_hps.md
+++ b/docs/source/hierarchical_parameter_server/profiling_hps.md
@@ -194,8 +194,8 @@ perf_analyzer -m your_model_name --collect-metrics -f perf_output.csv --verbose-
 |--------------------|-----|-----|
 |Profile client side E2E Pipeline|NO|YES|
 |Profile sever side key lookup session|YES|YES|
-|Pofile the embedding cache component|YES|NO|
+|Profile the embedding cache component|YES|NO|
 |Profile the database backend component|YES|NO|
 |Support different key distributions|YES|YES|
 |Concurrency Support|NO|YES|
-|GPU/Memory Utilization|NO|YES|
\ No newline at end of file
+|GPU/Memory Utilization|NO|YES|

From 371e1535cc70f079f980c877137f9f13188e2c5f Mon Sep 17 00:00:00 2001
From: SangHyeon Park <39648636+shyeonn@users.noreply.github.com>
Date: Thu, 18 Apr 2024 16:48:40 +0900
Subject: [PATCH 2/2] Fix docs hps profiler example argument(powerlaw) no longer support --powerlaw argument change to --distribution powerlaw

---
 .../source/hierarchical_parameter_server/profiling_hps.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/source/hierarchical_parameter_server/profiling_hps.md b/docs/source/hierarchical_parameter_server/profiling_hps.md
index 5802613c9b..55da5af87a 100644
--- a/docs/source/hierarchical_parameter_server/profiling_hps.md
+++ b/docs/source/hierarchical_parameter_server/profiling_hps.md
@@ -29,7 +29,7 @@ latency over the configurable iteration, and then repeats the measurements until
 For example, if `--embedding_cache` is used the results will be show below:
 ```
-$ hps_profiler --iterations 1000 --num_key 2000 --powerlaw --alpha 1.2 --config /hugectr/model/ps.json --table_size 630000 --warmup_iterations 100 --embedding_cache
+$ hps_profiler --iterations 1000 --num_key 2000 --distribution powerlaw --alpha 1.2 --config /hugectr/model/ps.json --table_size 630000 --warmup_iterations 100 --embedding_cache
 ...
 *** Measurement Results ***
@@ -144,7 +144,7 @@ Optional arguments:
 Measurement example of the HPS Lookup Session
 ```
-$hps_profiler --iterations 1000 --num_key 2000 --powerlaw --alpha 1.2 --config /hugectr/Model_Samples/wdl/wdl_infer/model/ps.json --table_size 630000 --warmup_iterations 100 --lookup_session
+$hps_profiler --iterations 1000 --num_key 2000 --distribution powerlaw --alpha 1.2 --config /hugectr/Model_Samples/wdl/wdl_infer/model/ps.json --table_size 630000 --warmup_iterations 100 --lookup_session
 ...
 *** Measurement Results ***
 The Benchmark of: End-to-end lookup embedding keys for Lookup session
@@ -153,14 +153,14 @@ Latencies [900 iterations] min = 0.190813ms, mean = 0.243117ms, median = 0.23808
 Measurement example of the HPS Data Backend
 ```
-$hps_profiler --iterations 1000 --num_key 2000 --powerlaw --alpha 1.2 --config /hugectr/Model_Samples/wdl/wdl_infer/model/ps.json --table_size 630000 --warmup_iterations 100 --database_backend
+$hps_profiler --iterations 1000 --num_key 2000 --distribution powerlaw --alpha 1.2 --config /hugectr/Model_Samples/wdl/wdl_infer/model/ps.json --table_size 630000 --warmup_iterations 100 --database_backend
 ...
 *** Measurement Results ***
 The Benchmark of: Lookup the embedding key from default HPS database Backend
 Latencies [900 iterations] min = 0.075086ms, mean = 0.127312ms, median = 0.121235ms, 95% = 0.166826ms, 99% = 0.219295ms, max = 0.285409ms, throughput = 8248.44/s
 ```
 *`NOTE`*:
-1. If the user add the `--powerlaw` option, the queried embedding key will be generated with the specified argument `--alpha = **`.
+1. If the user add the `--distribution powerlaw` option, the queried embedding key will be generated with the specified argument `--alpha = **`.
 2. If the user add the `--hot_key_percentage=**` and `--hot_key_coverage=xx` options, the queried embedding key will generate the number of `--table_size` * `--hot_key_percentage` keys with this probability of `--hot_key_percentage=**`. For example `--hot_key_percentage=0.01`, `--hot_key_coverage=0.9` and `--table_size=1000`, then the first 1000*0.01=10 keys will appear in the request with a probability of 90%.
 3. It is recommended that users make mutually exclusive selections of three components(`--embedding_cache`,`--database_backend` and `--lookup_session`) to ensure the most accurate performance. Because the measurement results of the lookup session will include the performance results of the database backend and embedding cache.