- Self-Normalizing Neural Networks, Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter
- Improving Generalization Performance by Switching from Adam to SGD, Nitish Shirish Keskar, Richard Socher
- Can you Trust the Trend: Discovering Simpson's Paradoxes in Social Data, Nazanin Alipourfard, Peter G. Fennell, Kristina Lerman
- Word Translation Without Parallel Data, Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou