Merge pull request #11 from superconductive/graphql_auth
Graphql auth
abegong authored Apr 19, 2018
2 parents e73483c + 0d2c967 commit 2c061e6
Showing 5 changed files with 841 additions and 603 deletions.
105 changes: 87 additions & 18 deletions README.md
@@ -1,38 +1,107 @@
# pair

Cooper Pair is a Python library to simplify programmatic access to the DQM
GraphQL API. It is primarily intended to help us dogfood DQM services and
integrate DQM into our contract workflows (both Jupyter-based and batch),
and secondarily (and prospectively) as a resource for clients who want to write
their own code against DQM directly (perhaps as part of an Airflow graph).
`cooper_pair` is a Python library that provides programmatic access to Superconductive's GraphQL API.

It supports a limited number of common use cases. (See below.)
`cooper_pair` is *not* intended as a general-purpose integration library for GraphQL.
Most useful GraphQL queries are *not* supported within the `cooper_pair` API.

## Why limit the use cases?

GraphQL is a composable query language. The space of allowed queries is enormous, and
developers are empowered to choose the right query for a given job. This de-couples development
behind the API from development that consumes the API, and allows each to move faster,
independently.

Wrapping a flexible GraphQL API in a rigid Python library would completely defeat that purpose.

Instead, think of `cooper_pair` as training wheels. It makes it easy to quickly connect
to GraphQL, and perform a few common functions. It also provides a collection of example
queries to learn how to use GraphQL and the Allotrope API.

In other words, `cooper_pair` can help you get started, but you will be able to get far more
out of Allotrope once you learn to query it natively using GraphQL.

## Installation

cd cooper-pair
pip install .

Or,

pip install "git+ssh://git@github.com/superconductive/cooper.git#egg=cooper_pair&subdirectory=pair"

## Usage

### Instantiating the API
### Instantiate the API

from cooper_pair import CooperPair
pair = CooperPair()

### Creating a new checkpoint
pair = CooperPair(
    graphql_endpoint="http://my-data-valet-url:3010/graphql",
    email="my_user@some_email.com",
    password="my_very_secure_password"
)
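
To keep the password out of source files, the constructor arguments shown above could be read from the environment. The following is a minimal sketch, not part of `cooper_pair`: the helper name and the three environment variable names are hypothetical, and only the `graphql_endpoint`, `email`, and `password` keywords come from the example above.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
ENV_TO_KWARG = {
    "COOPER_PAIR_ENDPOINT": "graphql_endpoint",
    "COOPER_PAIR_EMAIL": "email",
    "COOPER_PAIR_PASSWORD": "password",
}


def cooper_pair_kwargs_from_env(environ=None):
    """Build the keyword arguments for CooperPair(...) from a mapping.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_KWARG if name not in env]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}


# Usage (assumes cooper_pair is installed and the variables are set):
# from cooper_pair import CooperPair
# pair = CooperPair(**cooper_pair_kwargs_from_env())
```

Failing early on a missing variable avoids sending a partially configured login request to the endpoint.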

### List datasets

### Adding a new dataset
import json

response = pair.list_datasets()
print(json.dumps(response, indent=2))

### Get a dataset
response = pair.get_dataset("RGF2YXNldPoxODl=")
print(json.dumps(response, indent=2))

### List checkpoints

response = pair.list_checkpoints()
print(json.dumps(response, indent=2))

### Create a new dataset and evaluate it against an existing checkpoint

From a dataframe:

import pandas as pd

my_df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [6, 7, 8, 9, 10],
})
response = pair.evaluate_checkpoint_on_pandas_df(
    checkpoint_id="Q2hlY2twb2ludDox",
    pandas_df=my_df,
    filename="my_dataframe_name"
)
evaluation_id = response['addEvaluation']['evaluation']['id']
dataset_id = response['addEvaluation']['evaluation']['dataset']['id']

From a file:

with open(filename, 'rb') as fd:
    dataset = pair.add_dataset_from_file(
        fd, project_id=project_id, created_by_id=created_by_id)
dataset_id = dataset['dataset']['id']
with open('my_file.csv', 'rb') as fd:
    # Bind the result to `response`, the name the lookups below read from.
    response = pair.evaluate_checkpoint_on_file(
        checkpoint_id="Q2hlY2twb2ludDox",
        fd=fd,
    )
evaluation_id = response['addEvaluation']['evaluation']['id']
dataset_id = response['addEvaluation']['evaluation']['dataset']['id']

Note: Evaluation is asynchronous. When the response first comes back from Allotrope,
it will have `status="created"`. This will change to `pending` when a worker picks it up,
then to `success` or `failed` depending on the result of the evaluation.

### Creating a new checkpoint by autoinspection
You can query for status as follows:

response = pair.query("""
    query evaluationQuery($id: ID!) {
        evaluation(id: $id) {
            id,
            status
        }
    }
    """,
    variables={
        'id': evaluation_id
    })
print(response)
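
Because evaluation is asynchronous, a caller will usually want to repeat the status query until the evaluation reaches `success` or `failed`. The following is a minimal polling sketch, not part of `cooper_pair`: the `wait_for_evaluation` helper is hypothetical, and it assumes the query response is a dict shaped like `{"evaluation": {"id": ..., "status": ...}}`, which should be checked against a real response.

```python
import time

STATUS_QUERY = """
query evaluationQuery($id: ID!) {
    evaluation(id: $id) {
        id,
        status
    }
}
"""

# Terminal statuses named in the note above.
FINAL_STATUSES = ("success", "failed")


def wait_for_evaluation(pair, evaluation_id, poll_interval=2.0,
                        timeout=300.0, sleep=time.sleep):
    """Poll the evaluation status until it is final or the timeout expires.

    `pair` is any object with a query(query, variables=...) method, such
    as a CooperPair instance. Returns the final status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = pair.query(STATUS_QUERY, variables={"id": evaluation_id})
        # Assumed response shape; adjust if the API wraps it differently.
        status = response["evaluation"]["status"]
        if status in FINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Evaluation %s still has status %r" % (evaluation_id, status))
        sleep(poll_interval)


# Usage (assumes `pair` and `evaluation_id` from the examples above):
# final_status = wait_for_evaluation(pair, evaluation_id)
```

The `sleep` parameter exists only so the loop can be tested without real delays; in normal use the default is fine.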

checkpoint = pair.add_checkpoint(checkpoint_name, autoinspect=True, dataset_id=dataset_id)
checkpoint_id = checkpoint['addCheckpoint']['checkpoint']['id']

### Creating a new checkpoint from JSON

import json
@@ -42,4 +111,4 @@ From a file:
pair.add_checkpoint_from_expectations_config(
checkpoint_config, "Checkpoint Name")

### Evaluating a checkpoint on a dataset
