Create demo that outlines how users can import their own data from CSV into the ENTR warehouse and run the example OpenOA notebooks on that data. #30

Open
jordanperr opened this issue Apr 19, 2022 · 4 comments
@jordanperr
Contributor

jordanperr commented Apr 19, 2022

Definition of Done

Done when the workflow for a trial run of ENTR on the runtime image, using CSV data that stakeholders upload, is documented and tested.

Problem/Context

A demonstration of the ENTR modeling guidelines all the way through to seeing analytics will make this "real" for interested stakeholders.

User Story/Stories

As an interested stakeholder, I want a sandbox demonstration of ENTR that does not run on my production infrastructure (i.e. without risk), so that I can understand its value, work out how ENTR could be deployed in production at my company beyond what the docs describe, and have artifacts to help me, my peers, and management decide whether to adopt it.

Design/Solution Ideas

  • Need to define the use cases to demo so that we can clearly define the data requirements

Discovery Work Outline/Notes

  • [ ]
@ejsimley
Contributor

Ideas from 4/12/2022

  • Use Spark to read in csv files
  • User specifies column mapping to ENTR column names
  • dbt converts wide tables to ~4 column tables in standard ENTR format
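
A minimal sketch of the column mapping and wide-to-long reshaping described above, in plain Python rather than Spark or dbt so it runs anywhere. The source headers, the tag names in `COLUMN_MAP`, and the four output columns (asset id, timestamp, tag, value) are placeholders I made up for illustration; they are not the actual ENTR column names.

```python
import csv
import io

# Hypothetical mapping from a user's CSV headers to warehouse tag names.
# These are placeholders, not the real ENTR column names.
COLUMN_MAP = {
    "Wind Speed (m/s)": "wind_speed",
    "Power (kW)": "power",
}


def wide_to_long(csv_text, column_map, id_col, time_col):
    """Yield (asset_id, timestamp, tag, value) rows from a wide CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        for source_name, tag in column_map.items():
            raw = row.get(source_name)
            if not raw:
                continue  # skip missing or empty readings
            yield (row[id_col], row[time_col], tag, float(raw))


# Small made-up sample: one turbine, two timestamps, one missing power value.
SAMPLE = """turbine,time,Wind Speed (m/s),Power (kW)
T01,2022-04-12T00:00:00,7.5,1200
T01,2022-04-12T00:10:00,8.0,
"""

long_rows = list(wide_to_long(SAMPLE, COLUMN_MAP, "turbine", "time"))
for r in long_rows:
    print(r)
```

In the actual workflow the mapping would presumably be supplied by the user and the reshaping done by dbt, so this only illustrates the shape of the transformation.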

@ejsimley
Contributor

ejsimley commented Apr 25, 2022

Reanalysis downloading notes from 3/28/2022 meeting:

  • The user will be responsible for downloading reanalysis data as csv files from PlanetOS using the OpenOA toolkit (or another method if they prefer). This happens when they import their operational data into the warehouse, not later when OpenOA is used to analyze the data
  • We can write a script to create a warehouse table from the downloaded csv files
  • We can provide documentation and an example showing how to download reanalysis data using the OpenOA PlanetOS toolkit

Reanalysis data importing notes from 4/4/2022

  • User will download reanalysis files and drop them into a specific folder. Alternatively, automate this by looping through the plant table and downloading files for all plants
  • dbt workflow to add downloaded csv files to seed table
  • Eric to send Lewis example Python code for downloading reanalysis data using OpenOA
  • For now, we'll create a separate table for each reanalysis product
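
A rough sketch of the "drop files into a folder, one table per reanalysis product" idea from the notes above. It uses Python's built-in `sqlite3` as a stand-in for the warehouse. The `<product>_<plant>.csv` file naming, the `reanalysis_<product>` table names, the `plant_id` column, and the sample column names are all assumptions made for illustration, not something decided in the meetings. The download step itself is left out.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_reanalysis_folder(folder, conn):
    """Load each <product>_<plant>.csv in `folder` into reanalysis_<product>.

    Returns a dict of table name -> number of rows inserted.
    """
    loaded = {}
    for path in sorted(Path(folder).glob("*.csv")):
        product, _, plant = path.stem.partition("_")
        if not product.isalnum() or not plant:
            continue  # ignore files that do not follow the assumed naming
        table = f"reanalysis_{product.lower()}"
        with path.open(newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            if not all(col.isidentifier() for col in header):
                raise ValueError(f"unexpected column name in {path.name}")
            cols = ", ".join(f'"{col}"' for col in header)
            conn.execute(
                f'CREATE TABLE IF NOT EXISTS "{table}" (plant_id, {cols})'
            )
            marks = ", ".join("?" * (len(header) + 1))
            rows = [[plant] + row for row in reader]
            conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
        loaded[table] = loaded.get(table, 0) + len(rows)
    conn.commit()
    return loaded


# Demo with made-up files; the product and plant names are examples only.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("era5_plantA.csv", "era5_plantB.csv", "merra2_plantA.csv"):
        Path(tmp, name).write_text("datetime,ws_50m\n2022-01-01T00:00,6.1\n")
    conn = sqlite3.connect(":memory:")
    counts = load_reanalysis_folder(tmp, conn)

print(counts)
```

Keeping a `plant_id` column in each product table is one way to let files for several plants share a table; the real schema may well differ.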

@lewisarmistead
Collaborator

Examples that we would need to support: CP's data from Zenodo
https://zenodo.org/record/5946808
https://zenodo.org/record/5841834

@ejsimley
Contributor

To avoid memory issues with dbt seed, we decided to try reading from csv files directly with SQL when materializing warehouse tables.
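
One possible shape for this, assuming the warehouse is queried through Spark SQL (the thread mentions Spark for reading csv files, but not the exact SQL the team used). The helper below only builds a `CREATE OR REPLACE TEMPORARY VIEW ... USING csv` statement; `csv_view_sql`, the view name, and the path are hypothetical.

```python
import re


def csv_view_sql(view_name, csv_path):
    """Build a Spark SQL statement exposing a CSV file as a temporary view."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", view_name):
        raise ValueError(f"invalid view name: {view_name!r}")
    if "'" in csv_path:
        raise ValueError("path must not contain single quotes")
    return (
        f"CREATE OR REPLACE TEMPORARY VIEW {view_name}\n"
        "USING csv\n"
        f"OPTIONS (path '{csv_path}', header 'true', inferSchema 'true')"
    )


# Hypothetical view name and upload path, for illustration only.
statement = csv_view_sql("raw_scada", "/data/uploads/scada.csv")
print(statement)
```

A model could then select from the view when it is materialized, so the file is read by the SQL engine rather than loaded through `dbt seed`.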
