
What's next for the state of the art? #1

Open · LifeIsStrange opened this issue Jun 25, 2020 · 2 comments

@LifeIsStrange

Firstly, I would like to thank you for this fantastic work!

I am not an expert; I am more of a user of dependency parsing than a researcher, but I need highly accurate dependency parsing (I am trying to build true semantic parsing).
As you know, the current number 1 SOTA, Label Attention Layer + HPSG + XLNet (Mrini et al., 2019), has a LAS of 96.26.
While that is strong, it is still not accurate enough for many semantic downstream tasks!

So I'm looking for the future state of the art. What do you think would be the most promising direction?

I'm really interested in merging the best ideas from the other SOTA systems into a new state of the art that beats them all.
But some techniques are incompatible with others.

So let me ask some noob questions:
Could your crfpar benefit from using XLNet? From using HPSG? And/or from using a label attention layer?

LifeIsStrange commented Jun 25, 2020

Actually, I am the one who suggested to the authors of the HPSG paper that they experiment with XLNet instead of BERT (and it gave accuracy gains).
I suggested two other follow-up experiments to them, but they never took the time to run them.
So let me share them with you:

  1. Using the state-of-the-art activation function Mish can give high accuracy gains (a minimal sketch is shown after this list):
     https://github.com/digantamisra98/Mish
  2. Using the state-of-the-art meta optimizer Ranger:
     https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer/issues
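
For reference, Mish is simply `x * tanh(softplus(x))`. Here is a minimal PyTorch sketch; the small MLP it is dropped into is just a placeholder, not code from crfpar or the HPSG parser:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x))."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

# Hypothetical example: use Mish where a parser MLP would normally use ReLU.
mlp = nn.Sequential(nn.Linear(768, 500), Mish(), nn.Linear(500, 100))
scores = mlp(torch.randn(2, 10, 768))  # (batch, tokens, features)
```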

Ranger achieves huge accuracy gains on computer vision tasks, but sadly almost no NLP researchers use it (or are even aware of its existence).
So it might need some fine-tuning for transformers, e.g. maybe gradient centralization will need to be disabled (or maybe the contrary).
Related: lessw2020/Ranger-Deep-Learning-Optimizer#13
But I do believe that the first researcher who fine-tunes Ranger for NLP tasks / transformers will be able to improve the SOTA on many tasks for free.
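
A minimal sketch of what trying Ranger could look like, assuming the `ranger` package from the repo above is installed; the `use_gc` flag for toggling gradient centralization is based on my reading of that repo and may differ by version, and the tiny model/data are placeholders:

```python
import torch
import torch.nn as nn
from ranger import Ranger  # from lessw2020/Ranger-Deep-Learning-Optimizer

model = nn.Linear(100, 2)  # placeholder for a real parser/transformer
# use_gc=False would disable gradient centralization (assumed flag name)
optimizer = Ranger(model.parameters(), lr=1e-3, use_gc=True)

x, y = torch.randn(8, 100), torch.randint(0, 2, (8,))
for _ in range(3):  # toy training loop
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```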

@yzhangcs

@yzhangcs (Owner)

Hi, thanks for your suggestions.
Intuitively, crfpar may benefit from PLMs like XLNet, but I haven't conducted those experiments yet; I will let you know once they are completed. I have also tried a joint framework of dependency and constituency parsing like HPSG on crfpar, but found very little gain.
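
For what it's worth, a minimal sketch of feeding XLNet representations into a parser with HuggingFace `transformers`; the model name and the idea of pooling subwords into the biaffine scorers are assumptions about how crfpar would consume them, not its actual code:

```python
import torch
from transformers import XLNetTokenizer, XLNetModel

tokenizer = XLNetTokenizer.from_pretrained("xlnet-large-cased")
model = XLNetModel.from_pretrained("xlnet-large-cased")

sentence = "The parser reads the sentence ."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs)[0]  # last-layer hidden states: (1, num_subwords, 1024)

# A parser would then pool subwords back to word level and feed `hidden`
# into its biaffine arc/label scorers in place of (or alongside) word embeddings.
```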
