Clean up warnings for 2023 data pipeline #400

Merged
grgmiller merged 11 commits into development from greg/2023_cleanup
Dec 13, 2024

grgmiller
Collaborator

Purpose

This PR addresses several warnings that were being raised in the 2023 data pipeline:

  • When flagging differences between the inputs and outputs of the EIA-923 allocation process, update the threshold so that we only warn when differences are greater than 0.01%, which avoids warnings caused by rounding errors.
  • Add a multi-year backstop for sulfur content so that we have complete sulfur content data for JF fuels.
  • Drop bad generators from the EIA-923 pipeline that were introduced by the issue "Incorrect generator_ids in 2023 data" (catalyst-cooperative/pudl#3987).
  • Fix an issue with missing fuel data for a plant that had recently retired and so wasn't appearing in the primary fuel table, even though it was still reporting CEMS data.
  • Flag and drop CEMS units that report steam-only output and are not mapped to an EIA ID. These are likely district steam systems rather than power generation.
  • Add the units flagged as "Manual CAMD Excluded" in the PSDC to the list of non-grid-connected units to remove.
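
To illustrate the first item, here is a minimal sketch of a relative-difference check with a 0.01% threshold. The function name and signature are hypothetical and are not taken from the OGE codebase; only the 0.01% value comes from the description above.

```python
def exceeds_relative_threshold(
    input_total: float, output_total: float, threshold: float = 0.0001
) -> bool:
    """Return True when the relative difference is greater than `threshold`.

    A threshold of 0.0001 corresponds to 0.01%. The name and signature are
    assumptions made for this sketch, not the project's actual helper.
    """
    if input_total == 0:
        # No meaningful relative difference exists; flag any nonzero output.
        return output_total != 0
    return abs(output_total - input_total) / abs(input_total) > threshold


# A rounding-level difference (0.005%) stays below the threshold.
print(exceeds_relative_threshold(1000.0, 1000.05))
# A larger difference (0.02%) would trigger a warning.
print(exceeds_relative_threshold(1000.0, 1000.2))
```

Comparing on a relative rather than absolute basis keeps the check meaningful for both small and large plants.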

I also pulled in the manual mappings from the other PR, except for those related to steam-only units without a clear mapping.

Advances CAR-4691

Testing

Running pipeline for 2023
Will run for other years as well

Usage Example/Visuals

Example of warning about CEMS units missing EIA generator mappings (these are actually handled by our manual fuel category reference table since they do not report to EIA):
[image]

Example of steam units being removed:
[image]
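
As a rough sketch of how steam-only units without an EIA mapping could be flagged and dropped with pandas: the column names (`plant_id_epa`, `emissions_unit_id_epa`, `gross_generation_mwh`, `steam_load_1000_lb`), the function names, and the idea of passing in a set of mapped plant IDs are all assumptions for illustration, not the implementation in this PR.

```python
import pandas as pd

# Hypothetical key columns identifying a CEMS unit.
UNIT_KEYS = ["plant_id_epa", "emissions_unit_id_epa"]


def flag_steam_only_unmapped(cems: pd.DataFrame, mapped_plant_ids: set) -> pd.DataFrame:
    """Return the keys of units that report steam but no generation and have no EIA mapping."""
    totals = cems.groupby(UNIT_KEYS, as_index=False)[
        ["gross_generation_mwh", "steam_load_1000_lb"]
    ].sum()
    steam_only = (totals["gross_generation_mwh"] == 0) & (totals["steam_load_1000_lb"] > 0)
    unmapped = ~totals["plant_id_epa"].isin(mapped_plant_ids)
    return totals.loc[steam_only & unmapped, UNIT_KEYS]


def drop_flagged_units(cems: pd.DataFrame, flagged: pd.DataFrame) -> pd.DataFrame:
    """Remove all rows belonging to the flagged units (anti-join on the unit keys)."""
    merged = cems.merge(flagged, on=UNIT_KEYS, how="left", indicator=True)
    return merged[merged["_merge"] == "left_only"].drop(columns="_merge")
```

Flagging before dropping makes it possible to log the affected units first, as in the warning shown above.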

Review estimate

20 minutes

Future work

We are still missing some records in the plant attribute table:
[image]

There are still some mis-allocations happening, which probably need to be investigated in PUDL at some point:
[image]

Checklist

  • Update the documentation to reflect changes made in this PR
  • Format all updated python files using black
  • Clear outputs from all notebooks modified
  • Add docstrings and type hints to any new functions created

@grgmiller grgmiller requested a review from rouille December 8, 2024 02:08
@rouille
Collaborator

rouille commented Dec 8, 2024

The pipeline finished successfully. That said, in the monthly and annual power sector data, there are entries with a missing fuel category. For these rows all columns are 0; e.g., see the annual MISO power sector data below:
MISO.csv

Should we just drop these entries when writing the files?

It comes from these plants:

2024-12-07 22:08:08 [WARNING] oge.oge.helpers:334                                                 net_generation_mwh  fuel_consumed_for_electricity_mmbtu
plant_id_eia subplant_id ba_code fuel_category                                                         
2832         2           PJM     <NA>                          0.0                                  0.0
4014         5           MISO    <NA>                          0.0                                  0.0
10698        1           MISO    <NA>                          0.0                                  0.0
50282        1           PJM     <NA>                          0.0                                  0.0
50852        1           PJM     <NA>                          0.0                                  0.0
60910        1           ERCO    <NA>                          0.0                                  0.0
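
If dropping these entries at write time were the chosen approach, a minimal pandas sketch could look like the following. The function name is hypothetical; the column names are taken from the warning output above, and this is not necessarily how the pipeline resolves the issue.

```python
import pandas as pd


def drop_empty_unclassified_rows(df: pd.DataFrame, value_cols: list) -> pd.DataFrame:
    """Drop rows whose fuel_category is missing and whose value columns are all zero."""
    missing_fuel = df["fuel_category"].isna()
    # Treat missing values as zero so that empty rows are also caught.
    all_zero = (df[value_cols].fillna(0) == 0).all(axis=1)
    return df[~(missing_fuel & all_zero)]
```

Rows with a missing fuel category but nonzero values are deliberately kept, since those would indicate a real mapping gap rather than an empty record.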

@grgmiller
Collaborator Author

> The pipeline finished successfully. That said, in the monthly and annual power sector data, there are entries with a missing fuel category. For these rows all columns are 0; e.g., see the annual MISO power sector data below:
> MISO.csv

I'm not seeing any missing fuel categories in the files I got from running my pipeline yesterday. I think I may have already fixed this in one of the latest commits.

@grgmiller grgmiller requested a review from rouille December 13, 2024 18:28
Collaborator

@rouille rouille left a comment

🚢

@rouille rouille assigned rouille and grgmiller and unassigned rouille and grgmiller Dec 13, 2024
@rouille
Collaborator

rouille commented Dec 13, 2024

The pipeline stopped in Step 4 at commit 48b4a0f:

2024-12-13 11:21:36 [INFO] oge.data_pipeline:209 4. Cleaning CEMS data
Traceback (most recent call last):
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 704, in <module>
    main(sys.argv[1:])
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 210, in main
    cems = data_cleaning.clean_cems(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 975, in clean_cems
    cems = remove_negative_cems_data(cems)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: remove_negative_cems_data() missing 1 required positional argument: 'year'

@rouille rouille self-requested a review December 13, 2024 19:27
@rouille
Collaborator

rouille commented Dec 13, 2024

Another one:

Traceback (most recent call last):
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 704, in <module>
    main(sys.argv[1:])
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 210, in main
    cems = data_cleaning.clean_cems(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 981, in clean_cems
    cems = remove_plants(
           ^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 811, in remove_plants
    df = remove_unmapped_fuel(df, year)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 927, in remove_unmapped_fuel
    validation.limit_error_output_df(potential_missing_map.to_string())
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/validation.py", line 2292, in limit_error_output_df
    return pd.concat([df.head(100), df.tail(100)], axis=0)
                      ^^^^^^^
AttributeError: 'str' object has no attribute 'head'

@rouille
Collaborator

rouille commented Dec 13, 2024

Another one:

Traceback (most recent call last):
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 704, in <module>
    main(sys.argv[1:])
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_pipeline.py", line 210, in main
    cems = data_cleaning.clean_cems(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 979, in clean_cems
    cems = remove_plants(
           ^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 811, in remove_plants
    df = remove_unmapped_fuel(df, year)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/data_cleaning.py", line 948, in remove_unmapped_fuel
    validation.limit_error_output_df(fuel_only_unmapped.to_string(index=False))
  File "/Users/brdo/Singularity/open-grid-emissions/src/oge/validation.py", line 2292, in limit_error_output_df
    return pd.concat([df.head(100), df.tail(100)], axis=0)
                      ^^^^^^^
AttributeError: 'str' object has no attribute 'head'
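
Both `AttributeError` tracebacks appear to share one cause: `.to_string()` is called before `limit_error_output_df`, so the helper receives a `str` instead of a DataFrame. A minimal sketch of the likely fix is to truncate first and render to text afterwards. The body of `limit_error_output_df` below is reconstructed from the single line shown in the traceback, and the sample data is made up.

```python
import pandas as pd


def limit_error_output_df(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the first and last 100 rows, mirroring the line shown in the traceback."""
    return pd.concat([df.head(100), df.tail(100)], axis=0)


# Made-up stand-in for the table of unmapped units.
fuel_only_unmapped = pd.DataFrame({"plant_id_eia": range(500)})

# Failing order (as in the traceback): the helper is handed a str.
# limit_error_output_df(fuel_only_unmapped.to_string(index=False))

# Working order: truncate the DataFrame, then convert the result to text.
message = limit_error_output_df(fuel_only_unmapped).to_string(index=False)
```

The same reordering would apply to the `potential_missing_map` call in the earlier traceback.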

@grgmiller grgmiller merged commit d9791b7 into development Dec 13, 2024
2 checks passed
@grgmiller grgmiller deleted the greg/2023_cleanup branch December 13, 2024 19:56