Reward Augmented Maximum Likelihood for Neural Structured Prediction #11

sotetsuk · 2017-04-17T09:54:33Z

sotetsuk · 2017-04-17T10:05:18Z

8/10

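There are several studies that optimize models such as Seq2Seq with policy gradient methods, but this approach is far simpler than those while still being theoretically interesting.
The algorithm generates sample sequences based on edit distance and applies importance sampling (IS) to them with an exp-scaled reward (edit distance, etc.), which is an extremely heuristic and simple procedure.
What makes it interesting is that, for such a heuristic and simple method, the authors find a non-trivial relationship to entropy-regularized policy gradient methods (the only difference is that the direction of the KL divergence is reversed).
Through this connection to entropy-regularized policy gradient methods, the discussion also ties into work such as PCL.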
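The sampling step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes substitution-only edits (i.e. Hamming distance rather than full edit distance), integer token ids in `[0, vocab_size)`, and a reward equal to the negative number of edits, so that a sequence with `e` edits has weight `exp(-e / tau)`. Instead of reweighting samples, it draws directly from that exponentiated-reward distribution by first sampling the number of edits. The function names and the temperature parameter `tau` are hypothetical names chosen for this sketch.

```python
import math
import random


def edit_count_distribution(length, vocab_size, tau):
    """Return P(e) for e = 0..length, where
    P(e) is proportional to C(length, e) * (vocab_size - 1)**e * exp(-e / tau).

    C(length, e) * (vocab_size - 1)**e counts the sequences that differ from
    the target in exactly e positions (substitution-only assumption).
    """
    log_weights = [
        math.log(math.comb(length, e)) + e * math.log(vocab_size - 1) - e / tau
        for e in range(length + 1)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    max_log = max(log_weights)
    weights = [math.exp(lw - max_log) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def sample_augmented_target(target, vocab_size, tau, rng):
    """Sample a perturbed copy of `target` from the exp-scaled reward distribution."""
    probs = edit_count_distribution(len(target), vocab_size, tau)
    # Step 1: sample how many positions to edit.
    num_edits = rng.choices(range(len(target) + 1), weights=probs, k=1)[0]
    # Step 2: choose which positions to edit, uniformly without replacement.
    positions = rng.sample(range(len(target)), num_edits)
    augmented = list(target)
    for pos in positions:
        # Step 3: substitute a token that differs from the original one.
        new_token = rng.randrange(vocab_size - 1)
        if new_token >= target[pos]:
            new_token += 1
        augmented[pos] = new_token
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)
    target = [3, 1, 4, 1, 5]
    for _ in range(3):
        print(sample_augmented_target(target, vocab_size=10, tau=0.5, rng=rng))
```

In training, each sampled sequence would then be used as the target of an ordinary maximum-likelihood update, which is what keeps the method so much simpler than policy gradient. A smaller `tau` concentrates the samples on the original target, and a larger `tau` spreads them toward more heavily edited sequences.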

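That said, in the end they were not able to optimize BLEU directly for machine translation (this is left as future work), but this feels like an NLP issue rather than a machine learning one.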

I summarized it in slides:
https://docs.google.com/presentation/d/1P_ks8cqXcQmc8rBk7QlxcBHwfSdlNYnPmnWF0yj_nYs/edit#slide=id.g20593483e2_0_17

sotetsuk self-assigned this Apr 17, 2017
