This folder contains machine learning models implemented by researchers in TensorFlow.
The research models are maintained by their respective authors.
Note: Some research models are stale and have not yet been updated to the latest TensorFlow 2 release.

Folder | Framework | Description | Maintainer(s) |
---|---|---|---|
object_detection | TensorFlow Object Detection API | A framework that makes it easy to construct, train and deploy object detection models | jch1, tombstone, derekjchow, jesu9, dreamdragon, pkulzc |
slim | TensorFlow-Slim Image Classification Model Library | A lightweight high-level API of TensorFlow for defining, training and evaluating image classification models, including Inception V1/V2/V3/V4, Inception-ResNet-v2, ResNet V1/V2, VGG 16/19, MobileNet V1/V2/V3, NASNet-A_Mobile/Large, and PNASNet-5_Large/Mobile | sguada, nathansilberman |

Folder | Paper(s) | Description | Maintainer(s) |
---|---|---|---|
adv_imagenet_models | [1] Adversarial Machine Learning at Scale [2] Ensemble Adversarial Training: Attacks and Defenses | Adversarially trained ImageNet models | alexeykurakin |
adversarial_crypto | Learning to Protect Communications with Adversarial Neural Cryptography | Code to train encoder/decoder/adversary network triplets and evaluate their effectiveness on randomly generated input and key pairs | dave-andersen |
adversarial_logit_pairing | Adversarial Logit Pairing | Implementation of the Adversarial Logit Pairing paper, along with a few models pre-trained on ImageNet and Tiny ImageNet | alexeykurakin |
adversarial_text | [1] Adversarial Training Methods for Semi-Supervised Text Classification [2] Semi-supervised Sequence Learning | Adversarial Training Methods for Semi-Supervised Text Classification | rsepassi, a-dai |
attention_ocr | Attention-based Extraction of Structured Information from Street View Imagery | | xavigibert |
audioset | | Models for AudioSet: A Large Scale Dataset of Audio Events | plakal, dpwe |
autoaugment | [1] AutoAugment [2] Wide Residual Networks [3] Shake-Shake regularization [4] ShakeDrop Regularization for Deep Residual Learning | Train Wide-ResNet, Shake-Shake and ShakeDrop models on the CIFAR-10 and CIFAR-100 datasets with AutoAugment | barretzoph |
autoencoder | | Various autoencoders | snurkabill |
brain_coder | Neural Program Synthesis with Priority Queue Training | Program synthesis with reinforcement learning | danabo |
cognitive_mapping_and_planning | Cognitive Mapping and Planning for Visual Navigation | Implementation of a spatial memory based mapping and planning architecture for visual navigation | s-gupta |
compression | Full Resolution Image Compression with Recurrent Neural Networks | | nmjohn |
cvt_text | Semi-supervised sequence learning with cross-view training | | clarkkev, lmthang |
deep_contextual_bandits | Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling | | rikel |
deep_speech | Deep Speech 2 | End-to-End Speech Recognition in English and Mandarin | |
deeplab | [1] DeepLabv1 [2] DeepLabv2 [3] DeepLabv3 [4] DeepLabv3+ | DeepLab models for semantic image segmentation | aquariusjay, yknzhu, gpapan |
delf | [1] Large-Scale Image Retrieval with Attentive Deep Local Features [2] Detect-to-Retrieve | DELF: DEep Local Features | andrefaraujo |
domain_adaptation | [1] Domain Separation Networks [2] Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks | Code used for two domain adaptation papers | bousmalis, dmrd |
efficient-hrl | [1] Data-Efficient Hierarchical Reinforcement Learning [2] Near-Optimal Representation Learning for Hierarchical Reinforcement Learning | Code for performing hierarchical reinforcement learning | ofirnachum |
feelvos | FEELVOS | Fast End-to-End Embedding Learning for Video Object Segmentation | |
fivo | Filtering variational objectives for training generative sequence models | | dieterichlawson |
global_objectives | Scalable Learning of Non-Decomposable Objectives | TensorFlow loss functions that optimize directly for a variety of objectives including AUC, recall at precision, and more | mackeya-google |
im2txt | Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge | Image-to-text neural network for image captioning | cshallue |
inception | Rethinking the Inception Architecture for Computer Vision | Deep convolutional networks for computer vision | shlens, vincentvanhoucke |
keypointnet | KeypointNet | Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning | mnorouzi |
learned_optimizer | Learned Optimizers that Scale and Generalize | | olganw, nirum |
learning_to_remember_rare_events | Learning to Remember Rare Events | A large-scale life-long memory module for use in deep learning | lukaszkaiser, ofirnachum |
learning_unsupervised_learning | Meta-Learning Update Rules for Unsupervised Representation Learning | A meta-learned unsupervised learning update rule | lukemetz, nirum |
lexnet_nc | LexNET | Noun Compound Relation Classification | vered1986, waterson |
lfads | LFADS - Latent Factor Analysis via Dynamical Systems | Sequential variational autoencoder for analyzing neuroscience data | jazcollins, sussillo |
lm_1b | Exploring the Limits of Language Modeling | Language modeling on the one billion word benchmark | oriolvinyals, panyx0718 |
lm_commonsense | A Simple Method for Commonsense Reasoning | Commonsense reasoning using language models | thtrieu |
lstm_object_detection | Mobile Video Object Detection with Temporally-Aware Feature Maps | | dreamdragon, masonliuw, yinxiaoli, yongzhe2160 |
marco | Classification of crystallization outcomes using deep convolutional neural networks | | vincentvanhoucke |
maskgan | MaskGAN: Better Text Generation via Filling in the______ | Text generation with GANs | liamb315, a-dai |
namignizer | Namignizer | Recognize and generate names | knathanieltucker |
neural_gpu | Neural GPUs Learn Algorithms | Highly parallel neural computer | lukaszkaiser |
neural_programmer | Learning a Natural Language Interface with Neural Programmer | Neural network augmented with logic and mathematical operations | arvind2505 |
next_frame_prediction | Visual Dynamics | Probabilistic Future Frame Synthesis via Cross Convolutional Networks | panyx0718 |
pcl_rl | [1] Improving Policy Gradient by Exploring Under-appreciated Rewards [2] Bridging the Gap Between Value and Policy Based Reinforcement Learning [3] Trust-PCL: An Off-Policy Trust Region Method for Continuous Control | Code for several reinforcement learning algorithms | ofirnachum |
ptn | Perspective Transformer Nets | Learning Single-View 3D Object Reconstruction without 3D Supervision | xcyan, arkanath, hellojas, honglaklee |
qa_kg | Learning to Reason | End-to-End Module Networks for Visual Question Answering | yuyuz |
real_nvp | Density estimation using Real NVP | | laurent-dinh |
rebar | REBAR | Low-variance, unbiased gradient estimates for discrete latent variable models | gjtucker |
sentiment_analysis | Effective Use of Word Order for Text Categorization with Convolutional Neural Networks | A simple model to classify a document's sentiment | sculd |
seq2species | Seq2Species: A deep learning approach to pattern recognition for short DNA sequences | Neural Network Models for Species Classification | apbusia, depristo |
skip_thoughts | Skip-Thought Vectors | Recurrent neural network sentence-to-vector encoder | cshallue |
steve | Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion | A hybrid model-based/model-free reinforcement learning algorithm for sample-efficient continuous control | buckman-google |
street | End-to-End Interpretation of the French Street Name Signs Dataset | Identify the name of a street (in France) from an image using a Deep RNN | theraysmith |
struct2depth | Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos | Unsupervised learning of depth and ego-motion | aneliaangelova |
swivel | Swivel: Improving Embeddings by Noticing What's Missing | The Swivel algorithm for generating word embeddings | waterson |
tcn | Time-Contrastive Networks: Self-Supervised Learning from Video | Self-supervised representation learning from multi-view video | coreylynch, sermanet |
textsum | | Sequence-to-sequence with attention model for text summarization | panyx0718, peterjliu |
transformer | Spatial Transformer Network | Spatial transformer network that allows the spatial manipulation of data within the network | daviddao |
vid2depth | Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints | Learning depth and ego-motion unsupervised from raw monocular video | rezama |
video_prediction | Unsupervised Learning for Physical Interaction through Video Prediction | Predicting future video frames with neural advection | cbfinn |

If you want to contribute a new model, please submit a pull request.