XBRL Extractor badzipfile error #251
Is your error reproducible? We've gotten this error very sporadically, both in this data source and some others, and it seems to be random: it works 99.9% of the time. If it is reproducible:
Also, are you just trying to access the FERC Form 1 data more generally? We publish complete extracted versions of the FERC Forms 1, 2, 6, 60, and 714. See the Nightly Builds section of the PUDL Data Access docs. The 2023 FERC Form 1 data is included as of 2 weeks ago.

If you'd like to take the data for a spin without needing to set anything up, you can also go play with our example notebooks on Kaggle. The data there is updated once a week, and will also have the 2023 FERC data.

Also, depending on what data you are trying to access in the Form 1, you may want to look at the tables which we've cleaned up and integrated into our main PUDL Database. That covers only a few dozen of the many tables available in the XBRL-derived SQLite database, but they're way easier to work with, and are also integrated with the older DBF data going back to 1994.
Thanks for responding to this. I am looking for access to the FERC Form 1 data for the years 2020-2023. Where could I find the database for all of this?
Download links can be found in the nightly builds section of the Data Access documentation. I would recommend first looking at the FERC Form 1 tables which have been integrated into our main PUDL database, since it covers all years of data (1994-2023) and is much cleaner and more usable than the original DBF and XBRL data. However, there are only a couple dozen tables in there, so what you need may not be in there. Any table whose name contains
If the table(s) you need have not been fully integrated into PUDL, then you will need to access the SQLite DBs that we produce which are just conversions of the old DBF and newer XBRL data formats into a modern relational database format:
You can also browse these databases online first if you want:
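Once one of those SQLite files has been downloaded, a quick way to see what it contains is to list its tables with Python's standard `sqlite3` module. This is only a minimal sketch: the helper name `list_tables` is made up for illustration, and the path passed on the command line is whatever local file you downloaded, not a name taken from this thread.

```python
import sqlite3
import sys
from contextlib import closing


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite database file, sorted by name."""
    # closing() makes sure the connection is released even if the query fails.
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python list_tables.py path/to/downloaded.sqlite
    for table in list_tables(sys.argv[1]):
        print(table)
```

Note that `sqlite3.connect` silently creates an empty database if the path does not exist, so an empty result may just mean the path is wrong.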
I am encountering a BadZipFile error when loading the taxonomy file for FERC Form 1. I am using the taxonomy file from the FERC website:
https://ecollection.ferc.gov/taxonomyHistory
Please let me know ASAP if you have any comments or ideas, and we can get to talking!
Error:
C:\Users\PEG Intern>xbrl_extract "C:\Users\PEG Intern\downloads\Puget Sound Files" --db-path "ferc1-2021-sample.sqlite" --taxonomy "C:\Users\PEG Intern\Downloads\Form 1_2023-04-01_976 (1).zip"
2024-08-01 15:18:27 [ INFO] catalystcoop.ferc_xbrl_extractor.xbrl:247 Parsing taxonomy from Form 1_2023-04-01_976/
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Scripts\xbrl_extract.exe\__main__.py", line 7, in <module>
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\cli.py", line 156, in main
    return run_main(**vars(parse()))
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\cli.py", line 134, in run_main
    extracted = xbrl.extract(
                ^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\xbrl.py", line 58, in extract
    table_defs = get_fact_tables(
                 ^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\xbrl.py", line 254, in get_fact_tables
    taxonomy = Taxonomy.from_source(f, entry_point=taxonomy_entry_point)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\taxonomy.py", line 251, in from_source
    taxonomy, view = load_taxonomy_from_archive(taxonomy_source, entry_point)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\ferc_xbrl_extractor\arelle_interface.py", line 57, in load_taxonomy_from_archive
    file_source = FileSource.openFileSource(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\arelle\FileSource.py", line 44, in openFileSource
    filesource.openZipStream(sourceZipStream)
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\site-packages\arelle\FileSource.py", line 351, in openZipStream
    self.fs = zipfile.ZipFile(sourceZipStream, mode="r")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\zipfile\__init__.py", line 1349, in __init__
    self._RealGetContents()
  File "C:\Users\PEG Intern\AppData\Local\Programs\Python\Python312\Lib\zipfile\__init__.py", line 1416, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
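The exception is raised by Python's own `zipfile` module, so one thing worth ruling out is whether the downloaded taxonomy file is a readable zip archive at all. The sketch below is a generic diagnostic, not part of `ferc_xbrl_extractor`; the helper name `describe_archive` is made up for illustration, and it does not establish the actual cause of this error. It reports the file size, whether the file starts with the usual zip signature, and what `zipfile.is_zipfile` says.

```python
import sys
import zipfile
from pathlib import Path

# Signature of a local file header, found at the start of most non-empty zip files.
ZIP_MAGIC = b"PK\x03\x04"


def describe_archive(path: str) -> dict:
    """Collect basic facts about a file that is expected to be a zip archive."""
    p = Path(path)
    with p.open("rb") as f:
        head = f.read(4)
    return {
        "size_bytes": p.stat().st_size,
        "starts_with_zip_magic": head == ZIP_MAGIC,
        # is_zipfile looks for the end-of-central-directory record,
        # which is the same structure ZipFile needs to open the archive.
        "is_zipfile": zipfile.is_zipfile(p),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python describe_archive.py "path\to\taxonomy.zip"
    for key, value in describe_archive(sys.argv[1]).items():
        print(f"{key}: {value}")
```

If `is_zipfile` comes back `False` for the taxonomy file, the problem is with the file itself (for example an incomplete download, or a web page saved under a `.zip` name) rather than with the extractor. If it comes back `True`, the file is fine on disk and the failure is more likely in how the archive is being passed along.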