Autonomio provides a very high-level abstraction layer for rapidly testing research ideas and instantly creating neural-network-based decision-making models. Autonomio is built on top of Keras, using TensorFlow as a backend and spaCy for word vectorization. Autonomio makes deep learning and state-of-the-art linguistic processing accessible to anyone with basic computer skills.
This document focuses on a high-level overview of Autonomio's capabilities. If you're looking for the User Manual, you may want to read the docs instead.
- intuitive single-command neural network training
- training command accepts as little as 'x' and 'y' as inputs
- 'x' can be text, continuous or categorical data
- even 'y' and 'x' both as text yields a successful result
- 15 optional configurations from a single command
- seamlessly integrates word2vec with Keras deep learning
- interactive plots specifically designed for deep learning model evaluation
For most use cases, successfully running a state-of-the-art neural network works out of the box with zero configuration, yielding a model that can be used to predict outcomes later.
For first time use:
python -m spacy download en
Open a Jupyter notebook (or Python console) and type:
from autonomio.commands import *
%matplotlib inline
train(x,y,data,labels)
Even if 'x' is unstructured/text, this command will yield a functional neural network trained to predict the 'y' variable. 'y' can be continuous, categorical or binary. The model can be saved and then used to predict other data without training:
test(x,data)
This will yield a pandas dataframe with the predicted values and whichever label you connected each value with.
Training a neural network and then using it to make a prediction on a different dataset is almost as easy as the first example. This time, let's also introduce some of the parameters available for the 'train' function.
train('text','quality_score',
tweets.head(3000),
epoch=10,
dropout=.5,
flatten=.3,
save_model=True,
verbose=0)
Instead of the default 5 epochs, we're setting epoch to 10 and increasing the dropout rate between layers to 50%. Also, instead of using the default flattening (transforming the y feature to 0 and 1), we take only the bottom 30% in the interquartile range.
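To make the idea of flattening more concrete, here is a minimal sketch. It is not Autonomio's actual implementation: it assumes that a fractional value such as .3 acts as a quantile cutoff on a continuous 'y', and that rows at or below the cutoff are labeled 1. The helper name `flatten_by_quantile` is hypothetical.

```python
import numpy as np


def flatten_by_quantile(y, q):
    """Binarize a continuous target: 1 at or below the q-th quantile, else 0.

    Hypothetical helper for illustration only; Autonomio's own
    flattening logic may differ.
    """
    y = np.asarray(y, dtype=float)
    cutoff = np.quantile(y, q)  # value below which a fraction q of the data falls
    return (y <= cutoff).astype(int)


# Ten made-up scores; with q=0.3 the three lowest scores become 1.
quality_score = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = flatten_by_quantile(quality_score, 0.3)
print(labels.tolist())
```

Under these assumptions, a continuous 'y' becomes a binary target that a classifier can be trained on.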
Autonomio has been tested successfully on various Ubuntu and Mac systems using the provided setup scripts.
You need a machine with at least 4 GB of memory if you want to do text processing; otherwise 2 GB is totally fine and 1 GB might be OK. Even a very low-spec AWS instance runs Autonomio just fine.
For research and production environments we recommend one server with at least 4 GB of memory as a 'work station' and a separate instance with a high-end CUDA-supported GPU. The GPU instance costs roughly $1 per hour, and can be shut down when not in use.
As setting up the GPU station from the ground up can be a bit of a headache, we recommend using the AWS Machine Learning AMI to get set up quickly.
You probably want to use the setup_ubuntu.sh script to automate the process of setting up in a new system.
Here is the list of Linux commands to take care of all the dependencies.
Until now, most linguistic technologies, not to mention deep learning, have not been accessible in a way that allows a seamless workflow supporting the needs of less computer-savvy researchers. Yet the modern researcher can benefit significantly from unlocking the value in unstructured data, and by some estimates there is 9 times more unstructured data than structured data. Autonomio combines two cutting-edge AI technologies, word vectorization and deep learning, into one intuitive tool that researchers from a wide range of backgrounds can benefit from.
Because of the excellent out-of-the-box neural network performance provided by Keras, Autonomio users get state-of-the-art prediction capability with minimal configuration. Autonomio has been tested extensively and consistently provides in a single line of code the same result that you would get from Keras with 10 lines of code. Using real data involving billions of dollars in advertising spend, we've shown Autonomio to be capable of yielding outstanding results on previously unsolved problems, such as a better-than-human classifier result for niche website category classification.
Artificial intelligence and the signals intelligence method should be accessible to all researchers. In most cases, Autonomio allows total non-programmers to easily create advanced neural networks without any data preparation, through an easy-to-memorize single-command interface. Autonomio has two commands:
train(x,y,data)
and
test(x,data)
An example of Autonomio's usability is that x can be unstructured data, as is the case in an increasing number of research challenges in the digital age.
Autonomio uses a novel way of processing unstructured data:
- pre-process text
- use spaCy to vectorize the text
- create 300 individual features from the vector
- use the features as a signal in a Keras model
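The steps above can be sketched roughly as follows. This is an illustrative sketch and not Autonomio's own code: `text_to_features` and `fake_vectorize` are hypothetical names, the pre-processing shown (lowercasing and whitespace cleanup) is an assumption, and a stand-in vectorizer is used so the example runs without downloading a language model. With spaCy, the vectorizer could be something like `lambda t: nlp(t).vector`, where `nlp` is a loaded pipeline that ships 300-dimensional word vectors.

```python
import numpy as np
import pandas as pd


def text_to_features(texts, vectorize, dims=300):
    """Turn each text into `dims` numeric columns using the given vectorizer."""
    rows = []
    for text in texts:
        # Minimal pre-processing (assumed): lowercase and collapse whitespace.
        cleaned = " ".join(str(text).lower().split())
        vec = np.asarray(vectorize(cleaned), dtype=float)
        if vec.shape != (dims,):
            raise ValueError("expected a vector of length %d" % dims)
        rows.append(vec)
    # One column per vector dimension: v0, v1, ..., v299.
    columns = ["v%d" % i for i in range(dims)]
    return pd.DataFrame(rows, columns=columns)


def fake_vectorize(text):
    # Stand-in for a real word-vector model (hypothetical):
    # fills all 300 dimensions with the length of the cleaned text.
    return np.full(300, float(len(text)))


features = text_to_features(["Great product!", "  terrible   service "], fake_vectorize)
print(features.shape)
```

The resulting dataframe has one row per text and 300 numeric columns, which is the kind of input that can be passed to a Keras model as features.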
Autonomio's vectorizing engine, spaCy, currently supports 13 languages:
- English
- German
- Chinese
- Spanish
- Italian
- French
- Portuguese
- Dutch
- Swedish
- Finnish
- Hungarian
- Bengali
- Hebrew
NOTE: the spaCy language libraries each have to be downloaded separately.
spaCy makes it relatively streamlined to create support for any language, and the challenge can (and should) be approached iteratively.
Autonomio has been tested in several Mac OS X and Ubuntu environments (both server and desktop).