[RMP] Improve the speed of training retrieval models with Merlin Models #259

karlhigley · 2022-05-03T14:25:22Z

Problem:

Merlin Models needs to differentiate itself from other RecSys libraries, and GPU performance must be one of the areas where it does so. If our libraries don't follow best practices and deliver fast, measurable performance on the GPU, our potential customers have no reason to use them.

Goal:

Provide performant retrieval models in production
Follow our colleagues' best practices for GPU optimization

Constraints

Merlin Models is built on top of TensorFlow

Possible Optimizations

Retrieval models

EvenOldridge · 2022-07-04T22:25:26Z

@gabrielspmoreira @marcromeyn Where are we at with the perf regressions we were seeing? Can we close this?

viswa-nvidia · 2022-10-26T17:28:21Z

Check whether this bug is done: NVIDIA-Merlin/models#339. If it is, this issue should be closed. This is an ongoing effort. The profiling portion will be spun off as a separate RMP ticket.

gabrielspmoreira · 2022-10-26T19:09:22Z

@EvenOldridge @viswa-nvidia As we discussed in the Grooming meeting today, I tested the pending runtime issue in retrieval model training (NVIDIA-Merlin/models#339) against the current implementation (both V1 and V2), and it no longer occurs. So that bug was closed.
I also moved the profiling task for retrieval model pipelines out of this RMP into a new RMP, #709, which also covers ranking model pipelines, since the profiling tasks will require external support (Valerie).
So I am closing this RMP, as it already delivers value with the finished perf improvements on retrieval models.

karlhigley added the epic label May 3, 2022

karlhigley added this to the Merlin 22.05 milestone May 3, 2022

karlhigley added the roadmap label May 3, 2022

karlhigley mentioned this issue May 3, 2022

[RMP] Performance evaluation and improvements for model training and serving NVIDIA-Merlin/models#346

Closed

8 tasks

karlhigley assigned benfred, gabrielspmoreira and marcromeyn May 3, 2022

viswa-nvidia changed the title ~~[RMP] Performance optimization of model training and serving~~ [RMP] Performance optimization of model training May 25, 2022

karlhigley changed the title ~~[RMP] Performance optimization of model training~~ [RMP] Improve the speed of training models with Merlin Models May 25, 2022

rnyak removed this from the Merlin 22.05 milestone Jun 22, 2022

gabrielspmoreira changed the title ~~[RMP] Improve the speed of training models with Merlin Models~~ [RMP] Improve the speed of training retrieval models with Merlin Models Oct 26, 2022

gabrielspmoreira added this to the Merlin 22.11 milestone Oct 26, 2022

gabrielspmoreira closed this as completed Oct 26, 2022
