- Online Learning: A Comprehensive Survey [arXiv]
- Visual Interpretability for Deep Learning: a Survey [arXiv]
- CNN Is All You Need [arXiv]
- The Matrix Calculus You Need For Deep Learning [arXiv]
- Neural Collaborative Filtering [arXiv]
- DensePose: Dense Human Pose Estimation In The Wild [arXiv] [article]
- Nested LSTMs [arXiv]
- Generating Wikipedia by Summarizing Long Sequences [arXiv]
- Fine-tuned Language Models for Text Classification [arXiv] [code]
- Deep Learning: An Introduction for Applied Mathematicians [arXiv]
- Hierarchical Representations for Efficient Architecture Search [arXiv]
- DroNet: Learning to Fly by Driving [UZH docs] [article] [code]
- Deep Learning: A Critical Appraisal [arXiv]
- The NarrativeQA Reading Comprehension Challenge [arXiv] [dataset]
- Mathematics of Deep Learning [arXiv]
- State-of-the-art Speech Recognition With Sequence-to-Sequence Models [arXiv] [article]
- Peephole: Predicting Network Performance Before Training [arXiv]
- Deliberation Network: Pushing the frontiers of neural machine translation [Research at Microsoft] [article]
- Deep Learning Scaling is Predictable, Empirically [arXiv] [article]
- Distilling a Neural Network Into a Soft Decision Tree [arXiv]
- Neural Text Generation: A Practical Guide [arXiv]
- CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning [arXiv] [article]
- Online Deep Learning: Learning Deep Neural Networks on the Fly [arXiv]
- Don't Decay the Learning Rate, Increase the Batch Size [arXiv]
- Dynamic Routing Between Capsules [arXiv]
- A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [Science] [article] [code]
- Malware Detection by Eating a Whole EXE [arXiv] [article]
- Mastering the game of Go without human knowledge [Nature] [article]
- A systematic study of the class imbalance problem in convolutional neural networks [arXiv]
- Generalization in Deep Learning [arXiv]
- Swish: a Self-Gated Activation Function [arXiv]
- Generative Adversarial Networks: An Overview [arXiv]
- The IIT Bombay English-Hindi Parallel Corpus [arXiv] [article]
- Rainbow: Combining Improvements in Deep Reinforcement Learning [arXiv]
- Neural Color Transfer between Images [arXiv]
- StarSpace: Embed All The Things! [arXiv] [code]
- Deep Reinforcement Learning that Matters [arXiv] [code]
- WESPE: Weakly Supervised Photo Enhancer for Digital Cameras [arXiv] [article]
- A Brief Introduction to Machine Learning for Engineers [arXiv]
- A Deep Reinforcement Learning Chatbot [arXiv]
- Efficient Methods and Hardware for Deep Learning (Thesis) [Stanford Digital Repository]
- TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow [white paper] [code]
- Deep Learning for Video Game Playing [arXiv]
- Deep & Cross Network for Ad Click Predictions [arXiv]
- Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms [arXiv] [code]
- On the Effectiveness of Visible Watermarks [CVPR] [article]
- On Ensuring that Intelligent Machines Are Well-Behaved [arXiv]
- Training Deep AutoEncoders for Collaborative Filtering [arXiv] [code]
- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning [arXiv]
- Eyemotion: Classifying facial expressions in VR using eye-tracking cameras [arXiv] [article]
- Optimizing the Latent Space of Generative Networks [arXiv]
- Learning Transferable Architectures for Scalable Image Recognition [arXiv]
- Automatic Recognition of Deceptive Facial Expressions of Emotion [arXiv]
- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era [arXiv] [article]
- Do GANs actually learn the distribution? An empirical study [arXiv]
- Deep Interest Network for Click-Through Rate Prediction [arXiv]
- One Model To Learn Them All [arXiv] [code] [article]
- Deal or No Deal? End-to-End Learning for Negotiation Dialogues [S3AWS] [code] [article]
- Attention Is All You Need [arXiv] [code] [article]
- Sobolev Training for Neural Networks [arXiv]
- Forward Thinking: Building and Training Neural Networks One Layer at a Time [arXiv]
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour [arXiv]
- pix2code: Generating Code from a Graphical User Interface Screenshot [arXiv] [article] [code]
- Thinking Fast and Slow with Deep Learning and Tree Search [arXiv]
- Look, Listen and Learn [arXiv]
- Real-Time Adaptive Image Compression [arXiv]
- Learning to Skim Text [arXiv]
- Get To The Point: Summarization with Pointer-Generator Networks [arXiv] [code] [article]
- DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks [arXiv] [article] [code]
- Best Practices for Applying Deep Learning to Novel Applications [arXiv]
- Controllable Text Generation [arXiv]
- Evolving Deep Neural Networks [arXiv]
- Opening the Black Box of Deep Neural Networks via Information [arXiv] [video]
- Learning to Optimize Neural Nets [arXiv] [article]
- The Shattered Gradients Problem: If resnets are the answer, then what is the question? [arXiv]
- Deep Voice: Real-time Neural Text-to-Speech [arXiv]
- Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning [arXiv]
- The Game Imitation: Deep Supervised Convolutional Networks for Quick Video Game AI [arXiv]
- All-but-the-Top: Simple and Effective Postprocessing for Word Representations [arXiv]
- DyNet: The Dynamic Neural Network Toolkit [arXiv]
- Quantum Machine Learning [arXiv]
- Understanding deep learning requires rethinking generalization [arXiv]
- Can Active Memory Replace Attention? [arXiv]
- Xception: Deep Learning with Depthwise Separable Convolutions [arXiv]
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation [arXiv]
- Semi-Supervised Classification with Graph Convolutional Networks [arXiv]
- Deep Neural Networks for YouTube Recommendations [paper]
- Neural Machine Translation with Recurrent Attention Modeling [arXiv]
- Recurrent Highway Networks [arXiv]
- Bag of Tricks for Efficient Text Classification [arXiv]
- Context-Dependent Word Representation for Neural Machine Translation [arXiv]
- Concrete Problems in AI Safety [arXiv]
- Smart Reply: Automated Response Suggestion for Email [arXiv]
- Very Deep Convolutional Networks for Natural Language Processing [arXiv]
- TensorFlow: A system for large-scale machine learning [arXiv]
- Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles [arXiv]
- A Persona-Based Neural Conversation Model [arXiv]
- Natural Language Processing (almost) from Scratch [arXiv] - 2011