Add with_columns for pandas hello_world example
jernejfrank authored and elijahbenizzy committed Nov 6, 2024
1 parent 3c4bc70 commit 5f3ac72
Showing 4 changed files with 777 additions and 0 deletions.
Binary file added examples/pandas/with_columns/DAG.png
7 changes: 7 additions & 0 deletions examples/pandas/with_columns/README
@@ -0,0 +1,7 @@
# Using with_columns with Pandas

This example shows how to use `with_columns`, familiar from `pyspark` and `polars`, on a Pandas dataframe.

To see the example, look at the notebook.

![image info](./DAG.png)
47 changes: 47 additions & 0 deletions examples/pandas/with_columns/my_functions.py
@@ -0,0 +1,47 @@
import pandas as pd

from hamilton.function_modifiers import config

"""
Notes:
1. This file is used for all the [ray|dask|spark]/hello_world examples.
2. It therefore showcases how you can write something once and not only scale it, but also port it
to different frameworks with ease!
"""


@config.when(case="millions")
def avg_3wk_spend__millions(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend, in millions."""
    return spend.rolling(3).mean() / 1e6


@config.when(case="thousands")
def avg_3wk_spend__thousands(spend: pd.Series) -> pd.Series:
    """Rolling 3 week average spend, in thousands."""
    return spend.rolling(3).mean() / 1e3


def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups


def spend_mean(spend: pd.Series) -> float:
    """Shows function creating a scalar. In this case it computes the mean of the entire column."""
    return spend.mean()


def spend_zero_mean(spend: pd.Series, spend_mean: float) -> pd.Series:
    """Shows function that takes a scalar. In this case it subtracts the mean so that spend has zero mean."""
    return spend - spend_mean


def spend_std_dev(spend: pd.Series) -> float:
    """Function that computes the standard deviation of the spend column."""
    return spend.std()


def spend_zero_mean_unit_variance(spend_zero_mean: pd.Series, spend_std_dev: float) -> pd.Series:
    """Function showing one way to make spend have zero mean and unit variance."""
    return spend_zero_mean / spend_std_dev
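
To make the dataflow concrete, the sketch below calls a few of these functions by hand with plain pandas, in the order their parameter names imply. It does not use Hamilton; the `spend` values are hypothetical, and the function bodies are copied here only so the snippet runs on its own.

```python
import pandas as pd


# Bodies copied from my_functions.py so this snippet is self-contained.
def spend_mean(spend: pd.Series) -> float:
    return spend.mean()


def spend_zero_mean(spend: pd.Series, spend_mean: float) -> pd.Series:
    return spend - spend_mean


def spend_std_dev(spend: pd.Series) -> float:
    return spend.std()


def spend_zero_mean_unit_variance(spend_zero_mean: pd.Series, spend_std_dev: float) -> pd.Series:
    return spend_zero_mean / spend_std_dev


# Hypothetical input data, not taken from the example notebook.
df = pd.DataFrame({"spend": [10.0, 20.0, 30.0, 40.0]})

# Each parameter name matches the function that produces it,
# so the call order below follows the dependency chain by hand.
mean = spend_mean(df["spend"])
centered = spend_zero_mean(df["spend"], mean)
std = spend_std_dev(df["spend"])
scaled = spend_zero_mean_unit_variance(centered, std)

# Attach the derived columns to the original DataFrame.
result = df.assign(spend_zero_mean=centered, spend_zero_mean_unit_variance=scaled)
print(result)
```

In the example itself, Hamilton is expected to work out this ordering from the function and parameter names, so none of these calls would be wired up by hand.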