Commit

✨ compute checksums from ingredients only (#2514)
* ✨ compute checksums from ingredients only
Marigold authored Apr 16, 2024
1 parent 86d0712 commit 01dba80
Showing 7 changed files with 7 additions and 18 deletions.
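
As the title says, this commit derives a step's output checksum from its ingredients instead of hashing the data the step produced. A minimal sketch of that idea follows. The class `SketchStep`, the helper `checksum_ingredients`, and the choice of MD5 over sorted checksums are illustrative assumptions, not the actual `checksum_input` implementation in `etl/steps/__init__.py`.

```python
import hashlib
from typing import Iterable, List


def checksum_ingredients(ingredient_checksums: Iterable[str]) -> str:
    """Combine per-ingredient checksums into one digest (hypothetical helper)."""
    # Sort so the result does not depend on the order ingredients are listed in.
    joined = "\n".join(sorted(ingredient_checksums))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


class SketchStep:
    """Hypothetical stand-in for a step; not the real etl step class."""

    def __init__(self, code_checksums: List[str], dependency_checksums: List[str]) -> None:
        self.code_checksums = code_checksums
        self.dependency_checksums = dependency_checksums

    def checksum_input(self) -> str:
        # Assumed ingredients: the step's code files plus its dependencies' checksums.
        return checksum_ingredients(self.code_checksums + self.dependency_checksums)

    def checksum_output(self) -> str:
        # Mirrors the change in this commit: the output dataset is never read.
        return self.checksum_input()


step = SketchStep(code_checksums=["aaa"], dependency_checksums=["bbb", "ccc"])
same_inputs = SketchStep(code_checksums=["aaa"], dependency_checksums=["ccc", "bbb"])
changed_dep = SketchStep(code_checksums=["aaa"], dependency_checksums=["bbb", "ddd"])
```

Under these assumptions, `step` and `same_inputs` share a checksum while `changed_dep` gets a different one, and none of them needs the output dataset to exist on disk.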
5 changes: 4 additions & 1 deletion apps/owidbot/etldiff.py
@@ -55,6 +55,10 @@ def cli(

     nbranch = _normalise_branch(branch) if branch else "dry-run"

+    # TODO: only include site-screenshots if the PR is from owid-grapher. Similarly, don't
+    # run etl diff if the PR is from etl repo.
+    # - **Site-screenshots**: https://github.com/owid/site-screenshots/compare/{nbranch}
+
     body = f"""
 <details>
@@ -63,7 +67,6 @@ def cli(
 - **Admin**: http://staging-site-{nbranch}/admin/login
 - **Site**: http://staging-site-{nbranch}/
 - **Login**: `ssh owid@staging-site-{nbranch}`
-- **Site-screenshots**: https://github.com/owid/site-screenshots/compare/{nbranch}
 </details>
 <details>
10 changes: 3 additions & 7 deletions etl/steps/__init__.py
@@ -518,7 +518,8 @@ def _output_dataset(self) -> catalog.Dataset:
         return catalog.Dataset(self._dest_dir.as_posix())

     def checksum_output(self) -> str:
-        return self._output_dataset.checksum()
+        # output checksum is checksum of all ingredients
+        return self.checksum_input()

     def _step_files(self) -> List[str]:
         "Return a list of code files defining this step."
@@ -714,12 +715,7 @@ def has_existing_data(self) -> bool:
         return True

     def checksum_output(self) -> str:
-        # NOTE: we could use the checksum from `_dvc_path` to
-        # speed this up. Test the performance on
-        # time poetry run etl run garden --dry-run
-        # Make sure that the checksum below is the same as DVC checksum! It
-        # looks like it might be different for some reason
-        return files.checksum_file(self._dvc_path)
+        return Snapshot(self.path).m.outs[0]["md5"]

     @property
     def _dvc_path(self) -> str:
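
The second hunk stops hashing a file on every call and instead reads the MD5 already recorded in the snapshot's metadata (`outs[0]["md5"]`). The sketch below illustrates that trade-off with a plain dictionary. `md5_of_file`, `recorded_md5`, and the metadata dict are hypothetical stand-ins, and the dict shape is assumed from the `outs[0]["md5"]` access in the diff rather than taken from the `Snapshot` class.

```python
import hashlib
import os
import tempfile
from typing import Any, Dict


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes in chunks; cost grows with file size."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def recorded_md5(metadata: Dict[str, Any]) -> str:
    """Read the checksum stored in metadata; no file I/O on the data itself."""
    # Assumed layout: a list of outputs, each carrying an "md5" entry.
    return metadata["outs"][0]["md5"]


with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "data.csv")
    with open(data_path, "w", encoding="utf-8") as f:
        f.write("year,value\n2020,1\n")

    # The checksum is computed once, when the metadata is written...
    metadata = {"outs": [{"md5": md5_of_file(data_path), "path": "data.csv"}]}

    # ...and later lookups reuse it instead of re-reading the data file.
    checksums_match = recorded_md5(metadata) == md5_of_file(data_path)
```

The lookup is cheap but trusts the metadata to be in sync with the data file. The removed NOTE comment had raised the possibility that the two values could differ.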
@@ -16,9 +16,6 @@ dataset:
 Minimum and mean Southern Hemisphere daily ozone concentrations, measured in Dobson Units (DU).
 This dataset should be next updated by the source every year. We will update it on Our World in Data soon after the new version is published. At the link above you can directly access the source page and see the latest available data.
-licenses:
-- name: # TO BE FILLED. Example: Testing License Name
-url: # TO BE FILLED. Example: https://url_of_testing_source.com/license
 sources:
 - *source-testing

@@ -17,7 +17,6 @@ dataset:
 This dataset provides information on military and civilian deaths from wars, drawn from the book by Dunnigan and Martel (1987).
 licenses:
 - name: Doubleday (1987)
-url: # TO BE FILLED. Example: https://url_of_testing_source.com/license
 sources:
 - *source-testing

@@ -17,7 +17,6 @@ dataset:
 This dataset provides information on military and civilian deaths from wars, drawn from the chapter by Eckhardt (1991).
 licenses:
 - name: World Priorities
-url: # TO BE FILLED. Example: https://url_of_testing_source.com/license
 sources:
 - *source-testing

1 change: 0 additions & 1 deletion etl/steps/data/garden/war/2023-01-18/kaye_1985.meta.yml
@@ -17,7 +17,6 @@ dataset:
 This dataset provides information on direct and indirect military and civilian deaths from major armed conflicts, drawn from the report by Kaye et al. (1985).
 licenses:
 - name: Department of National Defence, Canada, Operational Research and Analysis Establishment, 1985
-url: # TO BE FILLED. Example: https://url_of_testing_source.com/license
 sources:
 - *source-testing

4 changes: 0 additions & 4 deletions etl/steps/data/garden/war/2023-01-18/sutton_1971.meta.yml
@@ -4,7 +4,6 @@ all_sources:
 published_by: Sutton, Antony. 1972. Wars and Revolutions in the Nineteenth Century. Hoover Institution Archives.
 url: https://searchworks.stanford.edu/view/3023823
 date_accessed: 2023-01-09
-publication_date: # TO BE FILLED. Example: 2023-01-01
 publication_year: 1971
 # description: Source description.

@@ -15,9 +14,6 @@ dataset:
 version: 2023-01-18
 description: |
 This dataset provides information on deaths from wars and revolutions, using data from Sutton (1972).
-licenses:
-- name: Unknown
-url: # TO BE FILLED. Example: https://url_of_testing_source.com/license
 sources:
 - *source-testing
