Lightweight implementation of Google's Protocol Buffers in Python.
In benchmarks, the protolite encoder ran twice as fast as
Google's. Using Python's timeit
module, the same data for both APIs was
encoded and decoded 10000 times. The lowest time of three attempts was
picked for each:
protobuf: 3.6064529418945312 seconds protolite: 1.7224960327148438 seconds
If we take the ratio of these two times we see that protolite was about two times faster than its counterpart.
Similarly, using Pypy we get about twice the speed:
protobuf: 0.807873010635376 seconds protolite: 0.4414529800415039 seconds
The benchmark
directory in the github repository contains the files
needed to re-run the tests . In addition, you will need the protobuf
Python library. Try it on your platform, but, keep your machine as quite
as possible so as to not skew the results:
PYTHONPATH=$PYTHONPATH:$(pwd) python benchmark/benchmark.py
Pass the --pypy flag if you want to use Pypy in order to warm up the Pypy JIT compiler and get a more accurate result:
PYTHONPATH=$PYTHONPATH:$(pwd) pypy benchmark/benchmark.py --pypy
You can also make changes to the benchmark/messages.proto
file to create
your own tests. You'll need to re-compile the messages.py
and
messages_pb2.py
files in the benchmark
directory afterwards by running
the make
command inside the same directory. Of course, you will need protoc
to compile Google's version.
Protocol Buffers (protobuf) is a data interchange format created by Google. protolite is a rewrite of its encoder and file generator specifically created and optimized for Python. The encoder is optimized for speed taking the language's properties in mind. The generator aims to provide ease-of-use and compatibility with the language. For example, messages are implemented using only dicts. Familiarity with protobuf is required in order to use protolite effectively.
You can download and install protolite from pypi with pip:
pip install python-protolite
Alternatively, you can clone the repository containing the source code from github and install protolite via setuptools:
git clone https://github.com/thelinuxkid/python-protolite.git cd python-protolite python setup.py install
protolite comes with a utility that generates Python files structured
for efficiency and readbility. After the installation you will have an
executable file called python-protolitec
. Its most simple usage
takes two positional arguments. The first is a list of the protobuf
definition files and the second a directory where to write the Python
version of those files:
python-protolitec proto/*.proto python
The output files will retain the same file name as the source; only the
extension will be changed. For example, the file proto/messages.proto
will produce the file python/messages.py
. You can use the --help
flag to view the other options offered by python-protolitec
.
Let's say you have a protobuf file called messages.proto
containing:
message User { optional uint32 userID = 1; enum UserType { STANDARD = 0; ADMIN = 1; } optional UserType type = 2; }
python-protolite
will create a Python module messages
with a user
object which has a decode
and an encode
method. To encode a
message you would do something like:
import messages msg_enc = {'user_id': 123, 'type': messages.user_type.STANDARD} data = messages.user.encode(msg_enc)
As you can see, python-protolite
changes camel-case variable names to
underscore. On the other end, to decode the message you would do
something similar:
import messages msg_dec = messages.user.decode(data)
The variable msg_dec will be equal to msg_enc.
The message objects also contain a pretty print method. Calling
message.user.pprint(msg_enc)
would produce:
{ "type": "STANDARD", "user_id": 123 }
You can pass the keyword argument stream
to pprint
to write to
a stream different than sys.stdout.
If you download the source code from github you will see a
grammar
directory at the root level. This directory contains all
the files used to create the parser and lexer in protolite.parser
,
the module used by python-protolitec
to parse the protobuf
definition files. If you are familiar with Antlr you can edit the
proto_lexer.g
and proto_parser.g
files in this directory to create a
new Python parser and/or lexer using the Antlr jar in the same directory:
cd grammar java -jar antlr-3.1.3.jar -fo . proto_lexer.g java -jar antlr-3.1.3.jar -fo . proto_parser.g
This will create four files: proto_lexer.py
,
proto_lexer.tokens
, proto_parser.py
and
proto_parser.tokens
. You can leave the *.tokens files where they
are but move the *.py files to protolite/parser to use your new parser
with python-protolitec
. If you want to use a different version of
Antlr do so at your own risk. You will likely need the new Antlr
version to match the Python runtime version in setup.py.