
Feature request: Performance metrics per model-version #1970

Closed
vitalyli opened this issue Jan 27, 2022 · 5 comments
Labels: stale, stat:awaiting response, type:feature

Comments

vitalyli (Contributor) commented Jan 27, 2022


Feature Request

Describe the problem the feature is intended to solve

We have multiple A/B tests running in which the same model_name can have different versions, and those versions can have different performance outcomes.

For instance, the same model with the same inputs can have a different number of layers or a different architecture, which can make it slower and heavier, especially for on-CPU processing.

We need a way to monitor and collect performance metrics such as p95 and average latency at model_name.version granularity; currently, only model_name-level metrics are visible, for example:

:tensorflow:serving:request_latency_bucket{model_name="tf_model_name",API="Predict",entrypoint="GRPC",le="2.52873e+08"} 16237

Describe the solution

Add one more set of performance counters inside the servable that track p95 and average latency at the more granular model-version level.
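
For illustration only, the exported histogram could carry a version label alongside model_name; the model_version label name below is an assumption, not an existing TensorFlow Serving metric label:

:tensorflow:serving:request_latency_bucket{model_name="tf_model_name",model_version="3",API="Predict",entrypoint="GRPC",le="2.52873e+08"} 16237

With such a label, p95 per version could then be derived on the monitoring side (e.g. with a Prometheus histogram_quantile over buckets grouped by model_name and model_version).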

Describe alternatives you've considered

The only way we can see right now is to execute a call from the client and measure latency on that side. However, that includes round-trip latency and the feature-engineering requirements specific to a given model-version, which makes it operationally challenging at scale and a maintenance headache, while still not giving us pure server-side metrics per model-version. A minimal sketch of this workaround is shown below.
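
A rough sketch of the client-side workaround, assuming a TensorFlow Serving gRPC endpoint on localhost:8500 and an input tensor named "x" (both names are illustrative, not taken from this issue); it pins a specific version via model_spec.version and times the round trip, which is exactly the network-inclusive number described above:

# client_latency_probe.py -- measure round-trip Predict latency for one model version
import time
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "tf_model_name"
request.model_spec.version.value = 3  # pin the A/B version under test
request.inputs["x"].CopyFrom(tf.make_tensor_proto([[1.0, 2.0]], dtype=tf.float32))

start = time.perf_counter()
stub.Predict(request, timeout=10.0)
elapsed_ms = (time.perf_counter() - start) * 1000.0
# Includes the network round trip and request construction, not pure server-side time.
print(f"round-trip latency: {elapsed_ms:.1f} ms")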

System information

  • OS Platform and Distribution: CentOS 7; later OEL 8
  • TensorFlow Serving installed from (source or binary): source
  • TensorFlow Serving version: 2.6
vitalyli (Contributor, Author) commented:

@godot73 Any opinion on this? Thanks!

vitalyli (Contributor, Author) commented:

@google Is this not feasible, or has nobody else asked for it before?

singhniraj08 assigned nniuzft and unassigned godot73 Feb 17, 2023
singhniraj08 self-assigned this Jun 23, 2023
singhniraj08 commented:

@vitalyli,

A similar feature request, #1959, is in progress. Please close this issue and follow that thread for updates.
Thank you.


github-actions bot commented Jul 1, 2023

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jul 1, 2023

github-actions bot commented Jul 9, 2023

This issue was closed due to lack of activity after being marked stale for the past 7 days.

github-actions bot closed this as completed Jul 9, 2023