Add explainers and minor datastore implementations #29

Open · wants to merge 14 commits into master

Conversation

connermanuel (Collaborator)

Two things:

  • Adds LIME explainability to the machine learning modules (a minimal sketch follows this list)
  • Implements the datastore (DS) on the fairness module
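
For reference, a minimal sketch of what a LIME tabular explanation looks like; the model, data, and feature names below are made up for illustration and are not this repo's actual integration:

    # Illustrative only: a LIME explanation for one prediction of a toy
    # scikit-learn regressor. Data and feature names are fabricated.
    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 4))
    y_train = 2.0 * X_train[:, 0] + rng.normal(scale=0.1, size=200)
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)

    explainer = LimeTabularExplainer(
        X_train,
        feature_names=["f0", "f1", "f2", "f3"],
        mode="regression",
    )
    # Explain a single row; as_list() returns (feature, weight) pairs from the local model.
    explanation = explainer.explain_instance(X_train[0], model.predict, num_features=4)
    print(explanation.as_list())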

TODO:

  • Currently the behavior of the datastore is inconsistent depending on whether or not you have Hadoop installed (PySpark was looking for local files in HDFS by default on my machine).
  • Right now the featurizer uses the datastore data as its unweighted data, but to weight the data we use a trick: for each record with weight w, we make w copies of it (sketched after this list). This creates an entirely new copy of the dataframe, held in memory, and one with duplicated rows at that, so ideally we would avoid it because it consumes a lot of memory.
    (Let me know if you'd like me to make issues for these)
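
A minimal sketch of the row-duplication weighting trick described in the second TODO, assuming an integer weight column; column and variable names are illustrative, not the project's actual schema:

    # Illustrative only: replicate each record `weight` times so downstream code
    # can treat the data as unweighted. This is the step that multiplies memory use,
    # since the weighted frame holds sum(weights) rows.
    # (array_repeat with a Column count needs PySpark 3.x.)
    import pyspark.sql.functions as F
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([("a", 1), ("b", 3)], ["record_id", "weight"])

    weighted = (
        df.withColumn("_copy", F.explode(F.array_repeat(F.lit(1), F.col("weight"))))
          .drop("_copy")
    )
    weighted.show()  # "a" appears once, "b" three times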

@LucioMelito (Collaborator) left a comment

Looks good to me!

Two things:

  • It would be great if you could write a bit more in the docstrings, using the reStructuredText format; this will let us populate the API docs automatically (a minimal example is sketched after this comment)
  • We started using pytest for testing instead of notebooks, although we only have tests for the datastore at the moment; if you want to try writing tests like that for fairness it'd be great, but no pressure

Also go ahead and add those two issues.
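
On the first point, a minimal sketch of a docstring in the reStructuredText field-list style that Sphinx can pull into API docs; the function and its parameters are hypothetical, not existing project code:

    def weight_records(df, weight_col="weight"):
        """Replicate each record according to its integer weight.

        :param df: input dataframe with one row per record.
        :type df: pyspark.sql.DataFrame
        :param weight_col: name of the integer weight column, defaults to "weight".
        :type weight_col: str
        :return: dataframe with each row repeated ``weight`` times.
        :rtype: pyspark.sql.DataFrame
        """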

@@ -82,7 +91,8 @@ def __init__(self, cfg_dir: str, spark: bool = True):
self.mobiledata: SparkDataFrame
self.mobilemoney: SparkDataFrame
self.antennas: SparkDataFrame
self.shapefiles: Union[Dict[str, GeoDataFrame]] = {}
# TODO: was this supposed to be a TypedDict?
Collaborator left a comment

That's supposed to be the dictionary of shapefiles for tower Voronoi polygons and/or admin units. I think the Union part is a mistake, and we can change Dict to Mapping.
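
For concreteness, a sketch of what that suggestion could look like; this just spells out the comment above (inside the class it would be self.shapefiles), not a committed change:

    # Suggested shape of the attribute: a mapping from layer name (e.g. tower
    # Voronoi polygons, admin units) to its GeoDataFrame, without the stray Union.
    from typing import Mapping

    from geopandas import GeoDataFrame

    shapefiles: Mapping[str, GeoDataFrame] = {}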

@@ -29,7 +29,8 @@ def __init__(self,
clean_folders: bool = False) -> None:
self.cfg = datastore.cfg
self.ds = datastore
-self.outputs = datastore.outputs + 'featurizer/'
+self.outputs = datastore.outputs + '/featurizer/'
self.spark_outputs = datastore.spark_outputs + '/featurizer/'
Collaborator left a comment

Another thing we should do is change all these statements to use os.path.join; feel free to do it, or we can leave it for later.
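
A sketch of what that change could look like for the lines in this diff; the class name is assumed for illustration, and downstream code that appends filenames to these paths would need os.path.join as well, since the trailing separator goes away:

    import os

    class Featurizer:  # class name assumed for illustration
        def __init__(self, datastore):
            # os.path.join inserts separators itself, so the leading/trailing '/'
            # bookkeeping in the string concatenation above is no longer needed.
            self.outputs = os.path.join(datastore.outputs, 'featurizer')
            self.spark_outputs = os.path.join(datastore.spark_outputs, 'featurizer')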

@review-notebook-app
Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.
