ingest-python-sample

Dataframe / Arrow Ingestion Example

This example shows how to ingest DataFrames in Apache Arrow format into Dozer and serve them through its APIs right away.

Dependencies

Initialization

Refer to Installation for instructions.

Download the sample dataset from NYC - TLC Trip Record Data.

./init.sh

The full example is available in this notebook.

Run Dozer in one terminal

dozer

Ingest data

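import polars as pl
from pydozer.ingest import IngestClient

# Load the sample trip dataset from the local Parquet file
df = pl.read_parquet('data/trips/fhvhv_tripdata_2022-01.parquet')
# Create an ingestion client pointing at the running Dozer instance
ingest_client = IngestClient(url="localhost:7005")
# Ingest only the first 1000 rows to keep the example small
small = df.head(1000)
ingest_client.ingest_df_arrow("trips", small)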
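
The snippet above ingests only the first 1000 rows. To ingest more of the file without sending it in a single call, one option is to slice the DataFrame into fixed-size batches. The following is a minimal sketch, not part of the original example: `batch_ranges` and `ingest_in_batches` are hypothetical helper names, and it assumes that repeated `ingest_df_arrow` calls on the same table append rows, which has not been verified here.

```python
from typing import Iterator, Tuple


def batch_ranges(total_rows: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that cover total_rows in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, total_rows, batch_size):
        # The last batch may be shorter than batch_size.
        yield offset, min(batch_size, total_rows - offset)


def ingest_in_batches(df, ingest_client, table: str, batch_size: int = 1000) -> int:
    """Send df to Dozer one slice at a time; return the number of batches sent.

    df is expected to be a polars DataFrame (uses .height and .slice) and
    ingest_client a pydozer IngestClient, as in the example above.
    Assumption: each ingest_df_arrow call appends to the table.
    """
    batches = 0
    for offset, length in batch_ranges(df.height, batch_size):
        ingest_client.ingest_df_arrow(table, df.slice(offset, length))
        batches += 1
    return batches
```

With the objects from the example above, this could be called as `ingest_in_batches(df, ingest_client, "trips", batch_size=1000)`.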

Sample Queries

from pydozer.api import ApiClient

api_client = ApiClient("trips", url="localhost:7003")
# Get Record Count
trips_count = api_client.count()
print(trips_count)

# Query with $limit (queries can also use $filter and $order_by)
trips = api_client.query({'$limit': 1})
if trips and trips.records:
    print(trips.records[0])
else:
    print("No records found")
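
The query above only uses `$limit`. As a rough sketch of how `$filter` and `$order_by` might be combined with it, the helper below assembles a query dictionary. `build_query` is a hypothetical helper, and the shape of the filter and ordering clauses (a `$gt` operator, a `"desc"` direction) is an assumption that should be checked against the Dozer query documentation. `trip_miles` is assumed to be a column of the trips dataset.

```python
from typing import Any, Dict, Optional


def build_query(
    limit: Optional[int] = None,
    filter_by: Optional[Dict[str, Any]] = None,
    order_by: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble a query dict, leaving out any clause that was not supplied."""
    query: Dict[str, Any] = {}
    if filter_by:
        query["$filter"] = filter_by
    if order_by:
        query["$order_by"] = order_by
    if limit is not None:
        query["$limit"] = limit
    return query


# Hypothetical query: five trips longer than 10 miles, longest first.
# The operator syntax ($gt) and the "desc" direction are assumptions.
query = build_query(
    limit=5,
    filter_by={"trip_miles": {"$gt": 10}},
    order_by={"trip_miles": "desc"},
)
print(query)
```

The resulting dictionary could then be passed to `api_client.query(query)` in the same way as the `$limit`-only query.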