Read and write JSON lazily, especially JSON arrays.
Handles both the JSON format:
[
  {
    "a": 1
  },
  {
    "a": 2
  }
]
As well as the JSON Lines format:
{"a":1}
{"a": 2}
Also supports streaming from gzipped files.
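To illustrate what streaming from a gzipped file means, here is a minimal sketch that uses only the standard library `gzip` and `json` modules. It does not show how json_arrays does this internally; the helper name `iter_gzipped_jsonl` is hypothetical and the file is created in a temporary directory for the example.

```python
import gzip
import json
import os
import tempfile


def iter_gzipped_jsonl(path):
    """Yield one decoded object per non-empty line of a gzipped JSON Lines file."""
    # Hypothetical helper for illustration only, not part of json_arrays.
    with gzip.open(path, "rb") as fp:
        for line in fp:
            line = line.strip()
            if line:
                yield json.loads(line)


# Write a tiny gzipped JSON Lines file, then stream it back entry by entry.
path = os.path.join(tempfile.mkdtemp(), "data.jsonl.gz")
with gzip.open(path, "wb") as fp:
    for entry in ({"a": 1}, {"a": 2}):
        fp.write(json.dumps(entry).encode("utf-8") + b"\n")

records = list(iter_gzipped_jsonl(path))
```

Because the helper is a generator, only one line is decompressed and decoded at a time, so memory use should stay small even for large files.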
Uses orjson if present, otherwise standard json.
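An optional-dependency fallback like this is commonly written as a guarded import. The sketch below shows the general pattern, assuming nothing about the library's actual code; the wrapper names `dumps` and `loads` are illustrative. It runs whether or not orjson is installed.

```python
try:
    import orjson  # optional, faster backend

    def dumps(obj) -> bytes:
        # orjson.dumps already returns bytes.
        return orjson.dumps(obj)

    def loads(data):
        return orjson.loads(data)

except ImportError:
    import json

    def dumps(obj) -> bytes:
        # json.dumps returns str, so encode it to match the orjson behaviour.
        return json.dumps(obj).encode("utf-8")

    def loads(data):
        return json.loads(data)


encoded = dumps({"a": 1})
decoded = loads(encoded)
```

Either way, callers see the same interface: bytes out of `dumps`, Python objects out of `loads`.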
# Using standard json
pip install json-arrays
# Using orjson
pip install json-arrays[orjson]
This library prefers files opened in binary mode.
Therefore, all dumps methods return bytes.
All loads methods handle str, bytes and bytearray arguments.
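The convention above (bytes out, str/bytes/bytearray in) can be sketched with the standard `json` module. The function names `dumps_lines` and `loads_lines` are hypothetical and chosen for this example; they are not the library's API.

```python
import json


def dumps_lines(entries) -> bytes:
    """Encode an iterable of objects as JSON Lines, returned as bytes."""
    return b"".join(json.dumps(entry).encode("utf-8") + b"\n" for entry in entries)


def loads_lines(data):
    """Decode JSON Lines given as str, bytes or bytearray into a list of objects."""
    if isinstance(data, str):
        data = data.encode("utf-8")
    # bytes() also accepts a bytearray, so all three input types end up as bytes.
    return [json.loads(line) for line in bytes(data).splitlines() if line.strip()]


payload = dumps_lines([{"a": 1}, {"a": 2}])
from_bytes = loads_lines(payload)
from_str = loads_lines(payload.decode("utf-8"))
from_bytearray = loads_lines(bytearray(payload))
```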
Allows you to use json.load and json.dump with both json and json-lines files as well as dumping generators.
import json_arrays
# This command tries to guess format and opens the file
data = json_arrays.load_from_file("data.json") # or data.jsonl
# Write to file, again guessing format
json_arrays.dump_to_file(data, "data.jsonl")
from json_arrays import json_iter, jsonl_iter
# Open and read the file without guessing
data = json_iter.load_from_file("data.json")
# Process file
# Write to file without guessing
jsonl_iter.dump_to_file(data, "data.jsonl")
import json_arrays
def process(data):
    for entry in data:
        # process
        yield entry

def read_process_and_write(filename_in, filename_out):
    json_arrays.dump_to_file(
        process(
            json_arrays.load_from_file(filename_in)
        ),
        filename_out
    )
You can also use json_arrays as a sink that you can send data to.
import json_arrays
with open("out.json", "wb") as fp:
    # guessing format
    with json_arrays.sink(fp) as sink:
        for data in data_source():
            sink.send(data)
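For readers curious how a sink of this kind could work, here is a minimal sketch of one that writes a JSON array incrementally to a binary file object. It is not the library's implementation; `JsonArraySink` and `array_sink` are hypothetical names, and the sketch always writes the array format rather than guessing.

```python
import io
import json
from contextlib import contextmanager


class JsonArraySink:
    """Hypothetical sink that writes each sent object into a JSON array."""

    def __init__(self, fp):
        self._fp = fp
        self._first = True

    def send(self, obj):
        # Open the array before the first entry, separate later entries with commas.
        self._fp.write(b"[\n" if self._first else b",\n")
        self._fp.write(json.dumps(obj).encode("utf-8"))
        self._first = False

    def close(self):
        # A sink that received nothing still produces a valid, empty array.
        self._fp.write(b"[]" if self._first else b"\n]")


@contextmanager
def array_sink(fp):
    sink = JsonArraySink(fp)
    try:
        yield sink
    finally:
        # Close the array on exit so the output stays well-formed.
        sink.close()


buffer = io.BytesIO()
with array_sink(buffer) as sink:
    for entry in ({"a": 1}, {"a": 2}):
        sink.send(entry)

result = json.loads(buffer.getvalue())
```

Each entry is written as soon as it is sent, so the full data set never has to be held in memory.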
This project keeps a CHANGELOG.
This project uses pdm. After cloning the repo, just run
make dev
make test
to set up a virtual environment, install dev dependencies and run the unit tests.
Note: If you run the command in an activated virtual environment, that environment is used instead.
Push a tag in the format v\d+.\d+.\d+ to the main branch to build and publish the package to PyPI.