use pkg_resources for local files, fixes #27; update code structure; update Readme, adding a full example
IMMM-SFA · Mar 31, 2021 · f783b7f
1 parent b2ab6b7
Showing 28 changed files with 156 additions and 63 deletions.
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -0,0 +1,4 @@
+graft mosartwmpy/tests
+include mosartwmpy/*.yaml
+global-exclude *.py[cod]
+prune input output validation paper docs dask-worker-space **/.mypy_cache .github
diff --git a/README.md b/README.md
@@ -6,46 +6,111 @@
 
 ## getting started
 
-Install requirements with `pip install -r requirements.txt`.
+Install `mosartwmpy` with:
+```shell
+pip install mosartwmpy
+```
+
+Download a sample input dataset spanning 1980-1985 by running the following command and selecting option `1`. This downloads and unpacks the inputs into your current directory. Note that the data is about 1.5 GB in size.
 
-`mosartwmpy` implements the [Basic Model Interface](https://csdms.colorado.edu/wiki/BMI) defined by the CSDMS, so driving it should be familiar to those accustomed to the BMI:
+```shell
+python -m mosartwmpy.download
+```
+
+Settings are defined by merging `mosartwmpy/config_defaults.yaml` with a user-specified file, which can override any of the default settings. Create a `config.yaml` file that defines your simulation:
+
+> `config.yaml`
+> ```yaml
+> simulation:
+>   name: tutorial
+>   start_date: 1981-05-24
+>   end_date: 1981-05-26
+> 
+> grid:
+>   path: ./input/domains/MOSART_NLDAS_8th_20160426.nc
+>   land:
+>     path: ./input/domains/domain.lnd.nldas2_0224x0464_c110415.nc
+> 
+> runoff:
+>   read_from_file: true
+>   path: ./input/runoff/Livneh_NLDAS_1980_1985.nc
+> 
+> water_management:
+>   enabled: true
+>   demand:
+>     read_from_file: true
+>     path: ./input/demand/RCP8.5_GCAM_water_demand_1980_1985.nc
+>   reservoirs:
+>     path: ./input/reservoirs/US_reservoir_8th_NLDAS3_updated_20200421.nc
+> ```
+
+`mosartwmpy` implements the [Basic Model Interface](https://csdms.colorado.edu/wiki/BMI) defined by the CSDMS, so driving it should be familiar to those accustomed to the BMI. To launch the simulation, open a Python shell and run the following:
 
 ```python
-from datetime import datetime, time
-from mosartwmpy.mosartwmpy import Model
+from mosartwmpy import Model
+
+# path to the configuration yaml file
+config_file = "config.yaml"
 
 # initialize the model
 mosart_wm = Model()
-mosart_wm.initialize()
+mosart_wm.initialize(config_file)
 
 # advance the model one timestep
 mosart_wm.update()
 
-# advance until a specificed timestamp
-mosart_wm.update_until(datetime.combine(datetime(2030, 12, 31), time.max).timestamp())
+# advance until the `simulation.end_date` specified in config.yaml
+mosart_wm.update_until(mosart_wm.get_end_time())
 ```
 
-Settings are defined by the merger of the `config_defaults.yaml` and an optional user specified file which can override any of the default settings:
+Alternatively, one can update the settings via code in the driving script using dot notation:
 
 ```python
-mosart_wm = Model('path/to/config/file.yaml')
+from mosartwmpy import Model
+from datetime import datetime
+
+mosart_wm = Model()
+mosart_wm.initialize()
+
+mosart_wm.config['simulation.name'] = 'Tutorial'
+mosart_wm.config['simulation.start_date'] = datetime(1981, 5, 24)
+mosart_wm.config['simulation.end_date'] = datetime(1985, 5, 26)
+# etc...
 ```
 
-Alternatively, one can update the settings via code in the driving script:
+The usual Python plotting libraries can be used to visualize data. Model state and output are stored as one-dimensional numpy ndarrays, so they must be reshaped to the grid shape before plotting in two dimensions:
 
 ```python
- mosart_wm = Model()
- mosart_wm.initialize()
-
- mosart_wm.config['simulation.name'] = 'Water Management'
- mosart_wm.config['simulation.start_date'] = datetime(1981, 1, 1)
- mosart_wm.config['simulation.end_date'] = datetime(1985, 12, 31)
-```
+import xarray as xr
+import matplotlib.pyplot as plt
+from mosartwmpy import Model
+
+mosart_wm = Model()
+mosart_wm.initialize('./config.yaml')
+
+mosart_wm.update_until(mosart_wm.get_end_time())
+
+surface_water = mosart_wm.get_value_ptr('surface_water_amount')
+
+# create an xarray from the data, which has some convenience wrappers for matplotlib methods
+data_array = xr.DataArray(
+    surface_water.reshape(mosart_wm.get_grid_shape()),
+    dims=['latitude', 'longitude'],
+    coords={'latitude': mosart_wm.get_grid_x(), 'longitude': mosart_wm.get_grid_y()},
+    name='Surface Water Amount',
+    attrs={'units': mosart_wm.get_var_units('surface_water_amount')}
+)
 
+# plot as a pcolormesh
+data_array.plot(robust=True, levels=32, cmap='winter_r')
+
+plt.show()
+
+```
 
 ## model input
 
-Several input files in NetCDF format are required to successfully run a simulation, which are not shipped with this repository due to their large size. The grid files, reservoir files, and a small range of runoff and demand input files can be obtained using the download utility by running `python download.py` in the repository root and choosing option 1 for "sample_input". Currently, all input files are assumed to be at the same resolution (for the sample files this is 1/8 degree over the CONUS). Below is a summary of the various input files:
+Several input files in NetCDF format are required to successfully run a simulation, which are not shipped with this repository due to their large size. The grid files, reservoir files, and a small range of runoff and demand input files can be obtained using the download utility by running `python -m mosartwmpy.download` and choosing option 1 for "sample_input". Currently, all input files are assumed to be at the same resolution (for the sample files this is 1/8 degree over the CONUS). Below is a summary of the various input files:
 
 <table>
 <thead>
@@ -134,16 +199,27 @@ Several input files in NetCDF format are required to successfully run a simulati
 
 Alternatively, certain model inputs can be set using the BMI interface. This can be useful for coupling `mosartwmpy` with other models. If setting an input that would typically be read from a file, be sure to disable the `read_from_file` configuration value for that input. For example:
 ```python
-    # get a list of model input variables
-    mosart_wm.get_input_var_names()
-
-    # disable the runoff read_from_file
-    mosart_wm.config['runoff.read_from_file'] = False
-
-    # set the runoff values manually (i.e. from another model's output)
-    surface_runoff = np.empty(mosart_wm.get_grid_size())
-    surface_runoff[:] = <values from coupled model>
-    mosart_wm.set_value('surface_runoff_flux', surface_runoff)
+import numpy as np
+from mosartwmpy import Model
+
+mosart_wm = Model()
+mosart_wm.initialize()
+
+# get a list of model input variables
+mosart_wm.get_input_var_names()
+
+# disable the runoff read_from_file
+mosart_wm.config['runoff.read_from_file'] = False
+
+# set the runoff values manually (e.g. from another model's output)
+surface_runoff = np.empty(mosart_wm.get_grid_size())
+surface_runoff[:] = 0.0  # placeholder: replace with values from the coupled model
+mosart_wm.set_value('surface_runoff_flux', surface_runoff)
+
+# advance one timestep
+mosart_wm.update()
+
+# continue coupling...
 ```
 
 ## model output
@@ -152,19 +228,25 @@ By default, key model variables are output on a monthly basis at a daily average
 
 Alternatively, certain model outputs deemed most important can be accessed using the BMI interface methods. For example:
 ```python
+import numpy as np
+from mosartwmpy import Model
+
+mosart_wm = Model()
+mosart_wm.initialize()
+
 # get a list of model output variables
 mosart_wm.get_output_var_names()
 
 # get the flattened numpy.ndarray of values for an output variable
-mosart_wm.get_value_ptr('supply_water_amount')
+supply = mosart_wm.get_value_ptr('supply_water_amount')
 ```
 
 ## testing and validation
 
-Before running the tests or validation, make sure to download the "sample_input" and "validation" datasets using the download utility `python download.py`.
+Before running the tests or validation, make sure to download the "sample_input" and "validation" datasets using the download utility `python -m mosartwmpy.download`.
 
 To execute the tests, run `./test.sh` or `python -m unittest discover mosartwmpy/tests` from the repository root.
 
-To execute the validation, run a model simulation that includes the years 1981 - 1982, note your output directory, and then run `./validation.sh` or `python validation/validate.py` from the repository root. This will ask you for the simulation output directory, think for a moment, and then open a figure with several plots representing the NMAE (Normalized Mean Absolute Error) as a percentage and the spatial sums of several key variables compared between your simulation and the validation scenario. Use these plots to assist you in determining if the changes you have made to the code have caused unintended deviation from the validation scenario. The NMAE should be 0% across time if you have caused no deviations. A non-zero NMAE indicates numerical difference between your simulation and the validation scenario. This might be caused by changes you have made to the code, or alternatively by running a simulation with different configuration or parameters (i.e. larger timestep, fewer iterations, etc). The plots of the spatial sums can assist you in determining what changed and the overall magnitude of the changes.
+To execute the validation, run a model simulation that includes the years 1981-1982, note your output directory, and then run `python -m mosartwmpy.validate` from the repository root. This will ask you for the simulation output directory, think for a moment, and then open a figure with several plots representing the NMAE (Normalized Mean Absolute Error) as a percentage and the spatial sums of several key variables compared between your simulation and the validation scenario. Use these plots to help determine whether your code changes have caused unintended deviation from the validation scenario. The NMAE should be 0% across time if you have caused no deviations. A non-zero NMAE indicates a numerical difference between your simulation and the validation scenario. This might be caused by changes you have made to the code, or alternatively by running a simulation with a different configuration or parameters (e.g. larger timestep, fewer iterations, etc.). The plots of the spatial sums can help you determine what changed and the overall magnitude of the changes.
 
 If you wish to merge code changes that intentionally cause significant deviation from the validation scenario, please work with the maintainers to create a new validation dataset.
diff --git a/mosartwmpy/__init__.py b/mosartwmpy/__init__.py
@@ -1 +1 @@
-from .model import Model
+from .model import Model
diff --git a/mosartwmpy/config/__init__.py b/mosartwmpy/config/__init__.py
diff --git a/mosartwmpy/config/config.py b/mosartwmpy/config/config.py
@@ -1,6 +1,8 @@
+import pkg_resources
 from benedict import benedict
 from benedict.dicts import benedict as Benedict
 
+
 def get_config(config_file_path: str) -> Benedict:
     """Configuration object for the model, using the Benedict type.
     
@@ -10,9 +12,8 @@ def get_config(config_file_path: str) -> Benedict:
     Returns:
         Benedict: A Benedict instance containing the merged configuration
     """
+    config = benedict(pkg_resources.resource_filename('mosartwmpy', 'config_defaults.yaml'), format='yaml')
+    if config_file_path is not None and config_file_path != '':
+        config.merge(benedict(str(config_file_path), format='yaml'), overwrite=True)
 
-    config = benedict('./config_defaults.yaml', format='yaml')
-    if config_file_path and config_file_path != '':
-        config.merge(benedict(config_file_path, format='yaml'), overwrite=True)
-
-    return config
+    return config
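
The merge performed by `get_config` above can be pictured with plain dictionaries. This is only a sketch of the intended behaviour (user values override packaged defaults key by key), not the `benedict` implementation: `deep_merge` is a hypothetical helper, and the default values shown are invented for illustration.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict where values in `overrides` replace those in `defaults`.

    Nested dicts are merged recursively; any other value is overwritten.
    Hypothetical stand-in for the benedict merge used in get_config.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# illustrative values only; not the real contents of config_defaults.yaml
defaults = {'simulation': {'name': 'mosartwmpy', 'log_to_std_out': True}}
user = {'simulation': {'name': 'tutorial'}}

config = deep_merge(defaults, user)
print(config)
```

Because the merge is recursive, a user file only needs to list the keys it changes; sibling keys such as `log_to_std_out` keep their default values.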
diff --git a/config_defaults.yaml → mosartwmpy/config_defaults.yaml b/config_defaults.yaml → mosartwmpy/config_defaults.yaml
diff --git a/mosartwmpy/direct_to_ocean/__init__.py b/mosartwmpy/direct_to_ocean/__init__.py
diff --git a/download.py → mosartwmpy/download.py b/download.py → mosartwmpy/download.py
@@ -1,11 +1,8 @@
-import enum
-import os
-
 from benedict import benedict
-
 from mosartwmpy.utilities.download_data import download_data
+import pkg_resources
 
-available_data = benedict.from_yaml('./mosartwmpy/data_manifest.yaml')
+available_data = benedict.from_yaml(pkg_resources.resource_filename('mosartwmpy', 'data_manifest.yaml'))
 
 data_list = []
 data = []
@@ -33,7 +30,7 @@
         0) exit
         
 """)
-try: 
+try:
     user_input = int(input("""
     Please select a number and press enter: """))
 except:
@@ -50,4 +47,3 @@
     print("")
     print("")
     download_data(data_list[user_input - 1])
-
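
The menu handling in `download.py` converts the typed selection to an integer inside a bare `except:`. A possible alternative, sketched below, isolates that step in a small function that catches only `ValueError`. `parse_selection` and its parameters are hypothetical names and are not part of this commit.

```python
from typing import Optional


def parse_selection(raw: str, option_count: int) -> Optional[int]:
    """Convert menu input into a zero-based index, or None for exit/invalid input."""
    try:
        choice = int(raw.strip())
    except ValueError:
        # non-numeric input; a bare `except:` would also swallow KeyboardInterrupt
        return None
    if 1 <= choice <= option_count:
        return choice - 1
    # 0 means "exit"; anything else is out of range
    return None
```

The returned index could then be used directly, for example `download_data(data_list[index])` when `index` is not `None`.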
diff --git a/mosartwmpy/flood/__init__.py b/mosartwmpy/flood/__init__.py
diff --git a/mosartwmpy/grid/__init__.py b/mosartwmpy/grid/__init__.py
diff --git a/mosartwmpy/grid/grid.py b/mosartwmpy/grid/grid.py
@@ -1,6 +1,7 @@
 import logging
 import numpy as np
 import pandas as pd
+from pathlib import Path
 import pickle
 import tempfile
 import xarray as xr
@@ -421,7 +422,7 @@ def from_files(path: str) -> 'Grid':
         Returns:
             Grid: a Grid instance populated with the columns from the dataframe
         """
-        if not path.endswith('.zip'):
+        if Path(path).suffix != '.zip':
             path += '.zip'
 
         grid = Grid(empty=True)
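
The suffix check in `from_files` accepts both `str` and `Path` arguments, but the following `path += '.zip'` only works when `path` is a string, since `Path` objects do not support `+`. A minimal sketch of a helper that handles both types is shown below. `with_zip_suffix` is a hypothetical name and is not part of this commit.

```python
from pathlib import Path
from typing import Union


def with_zip_suffix(path: Union[str, Path]) -> Path:
    """Return `path` as a Path that ends in '.zip', appending the suffix if missing."""
    path = Path(path)
    if path.suffix != '.zip':
        # append rather than replace, so 'grid.v2' becomes 'grid.v2.zip'
        path = path.with_name(path.name + '.zip')
    return path
```

Appending to the file name, rather than calling `with_suffix`, preserves names that already contain a dot.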

diff --git a/mosartwmpy/hillslope/__init__.py b/mosartwmpy/hillslope/__init__.py
diff --git a/mosartwmpy/input/__init__.py b/mosartwmpy/input/__init__.py
diff --git a/mosartwmpy/main_channel/__init__.py b/mosartwmpy/main_channel/__init__.py
diff --git a/mosartwmpy/model.py b/mosartwmpy/model.py
@@ -76,7 +76,7 @@ def initialize(self, config_file_path: str, grid: Grid = None, state: State = No
             self.name = sanitize_filename(self.config.get('simulation.name')).replace(" ", "_")
             # setup logging and output directories
             Path(f'./output/{self.name}/restart_files').mkdir(parents=True, exist_ok=True)
-            handlers = [logging.FileHandler(f'./output/{self.name}/mosartwmpy.log')]
+            handlers = [logging.FileHandler(Path(f'./output/{self.name}/mosartwmpy.log'))]
             if self.config.get('simulation.log_to_std_out'):
                 handlers.append(logging.StreamHandler())
             logging.basicConfig(

diff --git a/mosartwmpy/output/__init__.py b/mosartwmpy/output/__init__.py
diff --git a/mosartwmpy/output/output.py b/mosartwmpy/output/output.py
@@ -103,7 +103,7 @@ def write_output(self):
 
     # if file exists and it's not a new period, update existing file else write to new file and include grid variables
     if not is_new_period and Path(filename).is_file():
-        nc = open_dataset(filename).load()
+        nc = open_dataset(Path(filename)).load()
         # slice the existing data to account for restarts
         nc = nc.sel(time=slice(None, pd.to_datetime(self.current_time) - pd.Timedelta('1ms')))
         frame = concat([nc, frame], dim='time', data_vars='minimal')
@@ -158,5 +158,5 @@ def write_restart(self):
 
     logging.info('Writing restart file.')
     x = self.state.to_dataframe().to_xarray()
-    filename = f'./output/{self.name}/restart_files/{self.name}_restart_{self.current_time.year}_{self.current_time.strftime("%m")}_{self.current_time.strftime("%d")}.nc'
+    filename = Path(f'./output/{self.name}/restart_files/{self.name}_restart_{self.current_time.year}_{self.current_time.strftime("%m")}_{self.current_time.strftime("%d")}.nc')
     x.to_netcdf(filename)
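
The restart file name above is assembled inside one long f-string. As a sketch of the same naming scheme, the path can also be built from `pathlib` components with a single `strftime` call. `restart_path` is a hypothetical helper, and it assumes four-digit years, so that `%Y` matches `current_time.year`.

```python
from datetime import datetime
from pathlib import Path


def restart_path(name: str, current_time: datetime) -> Path:
    """Build the restart file path for a simulation name and timestamp."""
    # zero-padded month and day, matching the "%m" and "%d" used in write_restart
    stamp = current_time.strftime('%Y_%m_%d')
    return Path('output') / name / 'restart_files' / f'{name}_restart_{stamp}.nc'


print(restart_path('tutorial', datetime(1981, 5, 24)).as_posix())
# prints: output/tutorial/restart_files/tutorial_restart_1981_05_24.nc
```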
diff --git a/mosartwmpy/reservoirs/__init__.py b/mosartwmpy/reservoirs/__init__.py
diff --git a/mosartwmpy/state/__init__.py b/mosartwmpy/state/__init__.py
diff --git a/mosartwmpy/subnetwork/__init__.py b/mosartwmpy/subnetwork/__init__.py
diff --git a/mosartwmpy/tests/test_model.py b/mosartwmpy/tests/test_model.py
@@ -1,22 +1,22 @@
-import numpy as np
+from pathlib import Path
 import unittest
 
 from mosartwmpy import Model
 from mosartwmpy.grid.grid import Grid
-from mosartwmpy.state.state import State
 
 
 class ModelTest(unittest.TestCase):
     """Test that the model initializes and runs with the default settings."""
 
     def setUp(self):
         self.model = Model()
-        self.grid = Grid.from_files('./mosartwmpy/tests/grid.zip')
+        self.grid = Grid.from_files(Path('./mosartwmpy/tests/grid.zip'))
 
     def test_can_initialize_and_run(self):
-        self.model.initialize('./mosartwmpy/tests/test_config.yaml', grid = self.grid)
+        self.model.initialize(Path('./mosartwmpy/tests/test_config.yaml'), grid=self.grid)
         self.model.update()
         self.assertTrue(True, "model initializes and updates")
 
+
 if __name__ == '__main__':
-    unittest.main()
+    unittest.main()
diff --git a/mosartwmpy/update/__init__.py b/mosartwmpy/update/__init__.py
diff --git a/mosartwmpy/utilities/__init__.py b/mosartwmpy/utilities/__init__.py
diff --git a/mosartwmpy/utilities/download_data.py b/mosartwmpy/utilities/download_data.py
@@ -1,14 +1,16 @@
-import os
 import io
-import requests
-import zipfile
 import logging
+import os
+from pathlib import Path
+import pkg_resources
+import requests
 import sys
+import zipfile
 
 from benedict import benedict
 
 
-def download_data(dataset: str, destination: str = None, manifest: str = './mosartwmpy/data_manifest.yaml') -> None:
+def download_data(dataset: str, destination: str = None, manifest: str = pkg_resources.resource_filename('mosartwmpy', 'data_manifest.yaml')) -> None:
     """Convenience wrapper for the InstallSupplement class.
     
     Download and unpack example data supplement from Zenodo that matches the current installed
@@ -25,7 +27,10 @@ def download_data(dataset: str, destination: str = None, manifest: str = './mosa
     if not data_dictionary.get(dataset, None):
         raise Exception(f'Dataset "{dataset}" not found in the manifest ({manifest}).')
 
-    get = InstallSupplement(url = data_dictionary.get(f'{dataset}.url'), destination = destination if destination is not None else data_dictionary.get(f'{dataset}.destination', './'))
+    get = InstallSupplement(
+        url=data_dictionary.get(f'{dataset}.url'),
+        destination=destination if destination is not None else Path(data_dictionary.get(f'{dataset}.destination', './'))
+    )
     get.fetch_zenodo()
 
 

diff --git a/validation/validate.py → mosartwmpy/validate.py b/validation/validate.py → mosartwmpy/validate.py
@@ -1,14 +1,15 @@
 import matplotlib.pyplot as plt
 import numpy as np
 import os
+from pathlib import Path
 from xarray import open_mfdataset
 
 # TODO accept command line path input as alternative
 # TODO accept command line year input
 # TODO allow easily toggling between scenarios for variables of interest (no-wm, wm, heat, etc)
 
 years = [1981, 1982]
-baseline_data_path = 'validation/mosartwmpy_validation_wm_1981_1982.nc'
+baseline_data_path = Path('validation/mosartwmpy_validation_wm_1981_1982.nc')
 variables_of_interest = ['STORAGE_LIQ', 'RIVER_DISCHARGE_OVER_LAND_LIQ', 'WRM_STORAGE', 'WRM_SUPPLY']
 physical_dimensions = ['lat', 'lon']
 temporal_dimension = 'time'
@@ -29,7 +30,7 @@
 data = open_mfdataset(f"{data_path}/*.nc" if data_path[-3:] != '.nc' else data_path)
 
 try:
-    data = data.sel({ temporal_dimension: slice(f"{years[0]}", f"{years[-1]}") })
+    data = data.sel({temporal_dimension: slice(f"{years[0]}", f"{years[-1]}")})
     timeslice = slice(data[temporal_dimension].values[0], data[temporal_dimension].values[len(data[temporal_dimension].values) - 1])
     baseline_data = open_mfdataset(baseline_data_path)
     baseline_data = baseline_data.sel({ temporal_dimension: timeslice })
@@ -75,4 +76,4 @@
 figure.text(0.5, 0.04, 'time', ha='center')
 figure.text(0.04, 0.5, 'NMAE (%)', va='center', rotation='vertical')
 figure.tight_layout()
-plt.show()
+plt.show()
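
The README describes the validation metric as the NMAE (Normalized Mean Absolute Error) expressed as a percentage. The sketch below shows one common way to compute such a value with numpy. The normalization by the baseline's mean magnitude is an assumption and may differ from what `validate.py` does, and `nmae_percent` is a hypothetical name.

```python
import numpy as np


def nmae_percent(simulated: np.ndarray, baseline: np.ndarray) -> float:
    """Normalized mean absolute error, as a percentage of the baseline's mean magnitude."""
    simulated = np.asarray(simulated, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    scale = np.mean(np.abs(baseline))
    if scale == 0:
        raise ValueError('baseline is all zeros; NMAE is undefined')
    return float(100.0 * np.mean(np.abs(simulated - baseline)) / scale)


baseline = np.array([1.0, 2.0, 3.0, 4.0])
print(nmae_percent(baseline, baseline))  # identical inputs give 0.0
```

Under this definition, a simulation that matches the validation scenario exactly yields 0%, consistent with the README's statement that the NMAE should be 0% when no deviations were introduced.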