Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy #19770

Merged (13 commits) on Sep 17, 2019

Contributor

@zhaify zhaify commented Sep 11, 2019

Refer to PR 19452.
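For context, the backward pass named in the PR title can be pictured as follows: with sum pooling, every id in a sequence receives that sequence's output-gradient row, so the accumulation is one axpy call per id. This is only a minimal sketch under stated assumptions, not the kernel from this PR. `SumPoolEmbeddingGrad`, its parameters, and the dense `d_table` layout are hypothetical, and the local `saxpy` is a stand-in with the same unit-stride semantics as `cblas_saxpy`, so the snippet compiles without MKL.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for cblas_saxpy with unit strides: y[i] += alpha * x[i].
inline void saxpy(int n, float alpha, const float* x, float* y) {
  for (int i = 0; i < n; ++i) y[i] += alpha * x[i];
}

// Hypothetical helper (not the PR's kernel).
// ids:     flattened token ids of all sequences
// offsets: ids[offsets[s] .. offsets[s+1]) belong to sequence s
// d_out:   gradient of the pooled output, [num_seq x width], row-major
// d_table: gradient of the embedding table, [vocab x width], row-major,
//          assumed to be zero-initialised by the caller
void SumPoolEmbeddingGrad(const std::vector<int64_t>& ids,
                          const std::vector<size_t>& offsets,
                          const std::vector<float>& d_out, int width,
                          std::vector<float>* d_table) {
  for (size_t s = 0; s + 1 < offsets.size(); ++s) {
    const float* src = d_out.data() + s * width;
    for (size_t k = offsets[s]; k < offsets[s + 1]; ++k) {
      float* dst = d_table->data() + static_cast<size_t>(ids[k]) * width;
      // Each occurrence of an id adds the sequence's gradient row once.
      saxpy(width, 1.0f, src, dst);
    }
  }
}
```

With MKL or another CBLAS available, the inner call would presumably become `cblas_saxpy(width, 1.0f, src, 1, dst, 1)`.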

@zhaify zhaify force-pushed the fused_emb_seq_op_lg branch 2 times, most recently from 3f92d8c to 1a69c90 Compare September 11, 2019 21:50
@luotao1 luotao1 added the Intel label Sep 12, 2019
Contributor

@luotao1 luotao1 left a comment

LGTM

@luotao1 luotao1 merged commit 93c85c9 into PaddlePaddle:develop Sep 17, 2019
mapingshuo pushed a commit to mapingshuo/Paddle that referenced this pull request Sep 20, 2019
…dle#19770)

* Implement the operator with sparse matrix multiply

* Update the URL of mklml library.

test=develop

* Disable the MKLML implementation on non-Linux platforms.

test=develop

* optimize bp with mkl sparse matrix
test=develop

* tmp add fused_emb_seq layer

* Add the support of padding_idx attribute.

test=develop

* add padding_idx support
test=develop

* implement grad refer lego
test=develop
@@ -151,7 +157,7 @@ class FusedEmbeddingSeqPoolKernel : public framework::OpKernel<T> {
auto csr_colmuns = csr_colmuns_t.mutable_data<int>(context.GetPlace());
Contributor

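Please remove the temporary Tensors. These Tensors take a global lock, which may slow down multi-threaded execution.

Tensor csr_vals_t, csr_colmuns_t, csr_row_idx_t;

For the latest performance numbers, see PaddlePaddle/benchmark#151 (comment).
You can follow the X_Temp_Out intermediate variable in #21099.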
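One way to picture the request: the three CSR arrays can live in caller-owned scratch buffers that are reused across calls, so the kernel does not create temporary Tensors on each run. The sketch below is only an illustration of that idea and is not necessarily what #21099 does with `X_Temp_Out`. `CsrScratch`, `BuildCsr`, and `CsrTimesDense` are hypothetical names, and the hand-written multiply stands in for an MKL sparse routine. Ids equal to `padding_idx` are skipped, which is an assumption about how that attribute behaves.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical scratch buffers; the caller owns them and reuses them.
struct CsrScratch {
  std::vector<float> vals;   // occurrence count of each id in a sequence
  std::vector<int> columns;  // id (column index) for each stored value
  std::vector<int> row_idx;  // row s spans [row_idx[s], row_idx[s+1])
};

// Builds a [num_seq x vocab] CSR matrix of id counts from flattened ids.
void BuildCsr(const std::vector<int64_t>& ids,
              const std::vector<size_t>& offsets, int64_t padding_idx,
              CsrScratch* csr) {
  csr->vals.clear();  // clear() keeps capacity, so reuse avoids reallocation
  csr->columns.clear();
  csr->row_idx.clear();
  csr->row_idx.push_back(0);
  for (size_t s = 0; s + 1 < offsets.size(); ++s) {
    std::map<int64_t, float> counts;  // sorted by id, as CSR columns expect
    for (size_t k = offsets[s]; k < offsets[s + 1]; ++k) {
      if (ids[k] == padding_idx) continue;  // assumed padding semantics
      counts[ids[k]] += 1.0f;
    }
    for (const auto& kv : counts) {
      csr->columns.push_back(static_cast<int>(kv.first));
      csr->vals.push_back(kv.second);
    }
    csr->row_idx.push_back(static_cast<int>(csr->vals.size()));
  }
}

// out = csr * table, where table is [vocab x width], row-major.
void CsrTimesDense(const CsrScratch& csr, const std::vector<float>& table,
                   int width, std::vector<float>* out) {
  const size_t rows = csr.row_idx.empty() ? 0 : csr.row_idx.size() - 1;
  out->assign(rows * width, 0.0f);
  for (size_t r = 0; r < rows; ++r) {
    for (int p = csr.row_idx[r]; p < csr.row_idx[r + 1]; ++p) {
      const float* src =
          table.data() + static_cast<size_t>(csr.columns[p]) * width;
      for (int j = 0; j < width; ++j) {
        (*out)[r * width + j] += csr.vals[p] * src[j];
      }
    }
  }
}
```

A `CsrScratch` could be kept per thread (for example as a `thread_local` object) so that concurrent kernels do not share or lock anything while filling it.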

seiriosPlus pushed a commit to seiriosPlus/Paddle that referenced this pull request Dec 9, 2019
3 participants