Layer Normalization

  • Published in July 2016
  • Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

Simple summary

  • Similar to batch normalization.
  • Works well for recurrent networks, where batch normalization is difficult to apply.
  • Independent of batch size (see the sketch below).
  • Robust to rescaling of the input data.
  • Robust to scaling and shifting of the weight matrix.
  • Implicitly lowers the effective learning rate as the norm of the weights grows during training.
  • However, batch normalization still performs better for CNNs.
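
A minimal NumPy sketch of the per-sample normalization summarized above (not the paper's code); `gamma`, `beta`, and `eps` stand for the learned gain, learned bias, and a small numerical-stability constant, with values chosen here purely for illustration:

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each sample over its feature dimension.

    Unlike batch normalization, the statistics are computed per
    sample across features, so the result does not depend on the
    batch size at all.
    """
    mean = x.mean(axis=-1, keepdims=True)    # per-sample mean
    var = x.var(axis=-1, keepdims=True)      # per-sample variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalized activations
    return gamma * x_hat + beta              # learned gain and bias

# Works identically for any batch size, including a single sample.
x = np.random.randn(4, 8)   # (batch, features)
gamma = np.ones(8)
beta = np.zeros(8)
out = layer_norm(x, gamma, beta)
print(out.mean(axis=-1))    # ~0 for every sample
print(out.std(axis=-1))     # ~1 for every sample
```

Because `mean` and `var` are taken over the feature axis of each sample, rescaling the input `x` or the weights that produced it leaves `x_hat` unchanged, which is the source of the robustness properties listed above.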