MultiHeadAttention._compute_attention_mask() always returns a bool …
#35
| Job | Run time |
|---|---|
| | 2m 32s |
| | 15m 14s |
| | 22m 43s |
| | 13m 50s |
| | 5m 56s |
| Total | 1h 0m 15s |