Statistical machine translation

Neural machine translation:

  • OpenNMT (open-source NMT toolkit).
  • Machine translation has been statistical since around 1990.
  • opus.nlpl.eu (OPUS, a collection of parallel corpora).

Challenges:

  • Multiword expressions.
  • Words get removed and added in translation.

Noisy channel model:

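  • A message gets distorted in transmission and we try to recover the original information (e.g. airplane radio).
  • For translation: treat the sentence we receive as an encoded (distorted) version of a sentence in the other language, and decode it to recover that sentence.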

Output depends probabilistically on input:

  • Given a French sentence F, search for the English sentence E* that maximises P(E|F).
  • E* = argmax_E P(E|F). By Bayes' rule, P(E|F) = P(E) * P(F|E) / P(F); P(F) is the same for every E, so E* = argmax_E P(E) * P(F|E).
  • P(F|E) measures faithfulness; P(E) measures fluency (so the final sentence is not just a word-for-word mapping but makes sense in English).
  • P(F|E) is the translation model (TM) and P(E) is the language model (LM); they can be trained separately.
  • The argmax part is the decoder: a search algorithm that finds E*.
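
A minimal sketch of the decision rule above, assuming a tiny fixed set of candidate translations for the French sentence "la maison est petite". The probabilities in `tm` and `lm` are invented for illustration, and `decode` is a hypothetical helper; a real decoder searches a far larger space instead of enumerating candidates.

```python
import math

# Hypothetical toy values, invented for illustration only.
# Translation model P(F|E): how well each English candidate explains F.
tm = {
    "the house is small": 0.20,
    "small the house is": 0.20,
    "the home is little": 0.05,
}
# Language model P(E): how fluent each candidate is as English.
lm = {
    "the house is small": 0.010,
    "small the house is": 0.0001,
    "the home is little": 0.002,
}

def decode(candidates, tm, lm):
    """Return the candidate E that maximises log P(E) + log P(F|E)."""
    return max(candidates, key=lambda e: math.log(lm[e]) + math.log(tm[e]))

best = decode(list(tm), tm, lm)
print(best)
```

The first two candidates are equally faithful under the toy TM, so the LM decides between them: the scrambled word order is penalised for lack of fluency.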

Needed:

  • Sentence-aligned corpus.
  • Can we estimate P(F|E) from entire sentences? Not really: that only works if the exact sentence is in the training data. So we need a workaround...
  • Break sentences into words. Learn word translations by word-aligning a sentence-aligned corpus.
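
One common way to learn word translations from a sentence-aligned corpus is expectation maximization in the style of IBM Model 1. The sketch below is a simplified illustration, not necessarily the method used in the lecture: the three-sentence corpus is invented, `train_model1` is a hypothetical name, and the NULL word is left out to keep it short.

```python
from collections import defaultdict

# Toy sentence-aligned corpus (invented): (English words, French words).
corpus = [
    ("the house".split(), "la maison".split()),
    ("the flower".split(), "la fleur".split()),
    ("a flower".split(), "une fleur".split()),
]

def train_model1(corpus, iterations=20):
    """Estimate t[(f, e)] = P(f | e) with EM, starting from a uniform table."""
    f_vocab = {f for _, fs in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected count of f aligned to e
        total = defaultdict(float)   # expected count of e aligned to anything
        # E-step: spread each French word over the English words of its sentence.
        for es, fs in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-normalise expected counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return dict(t)

t = train_model1(corpus)
for e in ["the", "house", "flower", "a"]:
    best_f = max((f for (f, e2) in t if e2 == e), key=lambda f: t[(f, e)])
    print(e, "->", best_f)
```

No alignment is given in the data; the co-occurrence pattern alone (e.g. "la" appears with "the" in two sentences) is what lets EM pull the probabilities apart.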

Word alignment:

  • Can be used to generate correspondences between sentences.
  • Expectation maximization.
  • Uses n-grams.
  • Compute n-grams as always, by counting.
  • Smoothing is important.
  • Problem with extremely long words.
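
To illustrate the counting and smoothing points, here is a minimal bigram model with add-one (Laplace) smoothing. The training sentences are invented, the function names are hypothetical, and add-one is only one of several smoothing schemes, chosen here because it is the shortest to write down.

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-tokenised text."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def bigram_prob(prev, word, model, k=1.0):
    """Add-k estimate of P(word | prev); unseen bigrams get a small non-zero mass."""
    contexts, bigrams, vocab_size = model
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * vocab_size)

def sentence_prob(sentence, model):
    """P(E) as a product of smoothed bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= bigram_prob(prev, word, model)
    return p

model = train_bigram([
    "the house is small",
    "the house is big",
    "the book is small",
])
fluent = sentence_prob("the house is small", model)
scrambled = sentence_prob("small is house the", model)
print(fluent, scrambled)
```

Without smoothing, the scrambled sentence would get probability zero because it contains unseen bigrams; with smoothing it is merely much less likely than the fluent one.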

Phrase pairs:

  • For example: "of the" → "del" (Spanish).
  • Expand blocks.
  • What is more important: P(F|E) or P(E)?
  • It depends. For a physician, faithfulness P(F|E). For a novel, fluency P(E).
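
One way to express this trade-off, sketched here as an assumption and not taken from the notes, is to weight the two log-probabilities before adding them. The weights, the candidate names, and all probabilities below are invented for illustration.

```python
import math

def score(p_f_given_e, p_e, tm_weight=1.0, lm_weight=1.0):
    """Weighted sum of log P(F|E) and log P(E); the weights are hypothetical knobs."""
    return tm_weight * math.log(p_f_given_e) + lm_weight * math.log(p_e)

# Invented candidates: (P(F|E), P(E)).
candidates = {
    "faithful_but_clunky": (0.30, 0.001),
    "fluent_but_loose": (0.05, 0.020),
}

def pick(tm_weight, lm_weight):
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: score(*candidates[name], tm_weight, lm_weight))

print(pick(1.0, 1.0))  # equal weights
print(pick(3.0, 1.0))  # faithfulness weighted more heavily
```

With equal weights the fluent candidate wins in this toy setup; raising the TM weight, as one might for medical text, flips the choice to the faithful one.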