Releases: svenkreiss/pysparkling
v0.6.2
v0.6.1
testing continuous deployment
v0.6.0
- Broadcast, Accumulator and AccumulatorParam by @alexprengere
- support for increasing partition numbers in coalesce and repartition by @tools4origins
v0.5.0
- fixes for HDFS thanks to @tools4origins
- fix for empty partitions by @tools4origins
- API fixes by @artem0 and @tools4origins
- various updates for streaming submodule
- various updates to lint and test system
- logging: converted some info messages to debug
- ... documentation for some point releases is missing
v0.4.1
- retries for failed partitions
- improve pysparkling.streaming.DStream
- updates to docs
v0.4.0
- major addition: pysparkling.streaming
- updates to RDD.sample()
- reorganized scripts and tests
- added RDD.partitionBy()
- minor updates to pysparkling.fileio
v0.3.23
small improvements to fileio and better documentation
v0.3.22
- reimplement RDD.groupByKey()
- clean up of docstrings
v0.3.21
- faster text file reading by using io.TextIOWrapper for decoding
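
For readers unfamiliar with the technique named in this release, the following is a minimal sketch of decoding a byte stream through io.TextIOWrapper. It uses only the standard library and is not pysparkling's actual implementation; the helper name `read_text_lines` and the sample data are hypothetical.

```python
import io


def read_text_lines(raw_bytes, encoding="utf8"):
    """Decode a byte stream and return its lines without trailing newlines.

    Hypothetical helper for illustration only; not part of pysparkling.
    """
    stream = io.BytesIO(raw_bytes)
    # TextIOWrapper decodes incrementally while iterating over lines,
    # so the whole payload is not decoded in a single call.
    with io.TextIOWrapper(stream, encoding=encoding) as text:
        return [line.rstrip("\n") for line in text]


lines = read_text_lines("first\nsecond\nthird ü\n".encode("utf8"))
print(lines)
```

The wrapper is placed around any binary file-like object, so the same pattern should apply to a file opened in binary mode as well as to an in-memory buffer.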
v0.3.20
* Google Storage file system (using ``gs://``)
* dependencies: ``requests`` and ``boto`` are not optional anymore
* ``aggregateByKey()`` and ``foldByKey()`` return RDDs
* Python 3: use ``sys.maxsize`` instead of ``sys.maxint``
* flake8 linting
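
As an illustration of the ``sys.maxsize`` item above, here is a small sketch of using it as an "effectively unlimited" default. ``sys.maxint`` does not exist in Python 3, while ``sys.maxsize`` is available in both Python 2.6+ and Python 3. The function ``take`` is a hypothetical example and is not taken from pysparkling.

```python
import itertools
import sys


def take(iterable, n=sys.maxsize):
    """Return up to n items from iterable as a list.

    Hypothetical helper: sys.maxsize serves as the 'no limit' default,
    which works on Python 3 where sys.maxint was removed.
    """
    return list(itertools.islice(iterable, n))


print(take(range(10), 3))
print(len(take(range(10))))
```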