Mixed precision and TensorFlow XLA #322
-
It's essential that all models be able to take advantage of mixed precision and TensorFlow XLA.
Replies: 3 comments 4 replies
-
Thanks @dathudeptrai, we really value your input! We are currently fleshing out our core API but will be emphasizing both performance and usability in our development. Would you be interested in talking to us about your use case so we can learn more about potential users?
-
@jbischof My use case is just a general one: I always try to maximize TensorFlow performance, not only in training but also in deployment (server and mobile). Below are some important notes:
Given all the points above, I believe that if this framework supports all of TF2's advanced features, including but not limited to mixed precision, XLA, pruning, sparsity, TensorRT, and IREE, then it will lead the market, at least among teams that build products and focus on deployment. Note also that the industry community is much larger than the research community. cc @bhack @LukeWood @ianstenbit
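To make the two headline features concrete, here is a minimal sketch of enabling both in plain TF2/Keras today: a global `mixed_float16` policy plus XLA compilation of the training step via `jit_compile=True`. The model itself is a hypothetical stand-in, not a KerasNLP architecture.

```python
import tensorflow as tf

# Enable mixed precision globally: layers compute in float16 while
# keeping their variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# A small stand-in model (hypothetical; any Keras model works the same way).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(64, activation="relu"),
    # Keep the final layer in float32 for numerically stable outputs.
    tf.keras.layers.Dense(1, dtype="float32"),
])

# jit_compile=True asks Keras to compile the train/eval/predict steps with XLA.
model.compile(optimizer="adam", loss="mse", jit_compile=True)
```

With the policy set, the hidden layers compute in float16 while the explicit `dtype="float32"` on the output layer keeps the loss computation in full precision, which is the usual recipe for stable mixed-precision training.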
-
Thanks for opening! I think this is a great callout!
We are just starting to cover modeling workflows in KerasNLP, and I think what you are saying is spot on. We should strive to have mixed precision and XLA not just supported but enabled by default wherever possible. Both are huge speedups.
For XLA, I believe that most components that need support either already have it or have work in progress to add it (e.g. the beam search utility).
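In the meantime, any individual TF computation can already opt in via `tf.function(jit_compile=True)`; a minimal sketch (the function here is purely illustrative, not a KerasNLP API):

```python
import tensorflow as tf

# jit_compile=True compiles the traced graph with XLA, which can fuse the
# matmul and the scalar multiply into a single kernel.
@tf.function(jit_compile=True)
def scaled_matmul(a, b):
    return tf.matmul(a, b) * 2.0

a = tf.ones((4, 8))
b = tf.ones((8, 4))
out = scaled_matmul(a, b)
print(out.shape)  # (4, 4)
```

The first call pays a compilation cost; subsequent calls with the same input shapes reuse the compiled executable, which is where the speedup comes from.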
For mixed precision, we still need to figure out how to instantiate, say, a BERT network with mixed precision along with checkpointed weights. It looks like @jbischof just opened a bug, so I will kick off a comment there. #323
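One workable pattern, sketched here under assumptions (the `build_encoder` helper is a hypothetical stand-in for a BERT backbone, not a KerasNLP API): because the `mixed_float16` policy keeps variables in float32, weights saved from a full-precision model load into a mixed-precision network unchanged, provided the policy is set before the network is built.

```python
import numpy as np
import tensorflow as tf

# Set the policy BEFORE constructing layers so they pick it up.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

def build_encoder():
    # Hypothetical stand-in for a pretrained backbone such as BERT.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(32),
    ])

encoder = build_encoder()
# Under "mixed_float16" the kernels are still float32, so weights saved
# from a full-precision model round-trip without any conversion.
encoder.save_weights("/tmp/encoder.weights.h5")

restored = build_encoder()
restored.load_weights("/tmp/encoder.weights.h5")
```

This suggests the library could expose a single "set the dtype policy, then build and restore" path rather than requiring separate mixed-precision checkpoints.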