-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathREADME.Rmd
136 lines (103 loc) · 4.02 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# kflow
<!-- badges: start -->
[![R build status](https://github.com/ndiquattro/kflow/workflows/R-CMD-check/badge.svg)](https://github.com/ndiquattro/kflow)
<!-- badges: end -->
The ambition of kflow is to make it easier to build R based components orchestrated by Google's [Kubeflow](https://www.kubeflow.org/). Importantly, this package does *not* intend to be a full R replacement for the [python SDK](https://github.com/kubeflow/pipelines) (at least not yet!). However, I've had some good luck in wrapping the python SDK with [reticulate](https://rstudio.github.io/reticulate/), so if you need to go full R, that would be a good option.
## Installation
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("ndiquattro/kflow")
```
## Example Usage
To illustrate how to use {kflow} we'll set up a simple component example where we predict the transmission type of a car in `mtcars` based on an input parameter. We will work with a single function that will eventually be translated to a single kubeflow component.
Note that our argument names need to follow a convention for the conversion to component to succeed. Each argument must end in a slug that identifies the argument type. The conversions for slug to kubeflow type are:
*Inputs*
* _string = String
* _int = Integer
* _bool = Bool
* _float = Float
*Outputs*
* _out = outputPath
* _metrics = Metrics
* _uimeta = UI_metadata
With all that defined, let's create the function:
```{r}
library(kflow)
tm_predict <- function(predictor_string, file_out, performance_metrics, curve_uimeta) {
# Train Model
cars_dat <- mtcars
cars_dat$am <- factor(cars_dat$am)
form <- as.formula(paste0("am ~ ", predictor_string))
model <- glm(form, binomial, cars_dat)
# Make Predictions
cars_dat$prob_auto <- predict(model, type = "response")
# Save results
kf_write_output(cars_dat, file_out) # This ensures the path exists then writes to a kubeflow provided path
# Score and save metrics
kf_init_metrics() %>% # Start an empy JSON
kf_add_metric(name = "roc", value = yardstick::roc_auc(cars_dat, am, prob_auto)$.estimate, format = "RAW") %>%
kf_add_metric(name = "pr-auc", value = yardstick::pr_auc(cars_dat, am, prob_auto)$.estimate, format = "RAW") %>%
kf_write_output(curve_uimeta)
# Save ROC Curve
roc_file <- tempfile()
yardstick::roc_curve(test_preds_org, observed, estimated) %>%
mutate(specificity = 1 - specificity) %>% # convert to FPR
filter(is.finite(.threshold)) %>% # KF not going to like -Inf to Inf
write.csv(roc_file, col_names = FALSE) # Save without headers
kf_init_ui_meta() %>%
kf_add_roc(roc_file)
}
```
```{r}
component <-
kf_make_component(
"tm_predict",
"Transmission Predictor",
"Predicts if a car has an automatic transmission based on a provided variable",
"rocker/tidyverse:3.6.2"
)
cat(component, sep = "\n")
```
Next let's take a look at an example of how the metrics/ui meta functions work. Essentially they are just helpers for creating JSON in a structure kubeflow expects. They can be written by `kf_write_output()` just like any other information we want to save.
You can also inspect the JSON as you go. First create the base:
```{r}
base_metrics <- kf_init_metrics()
base_metrics
```
Then add a metric:
```{r}
base_metrics %>%
kf_add_metric(
name = "coolness-factor",
value = 100,
format = "RAW"
)
```
You can chain as many metrics together as you'd like:
```{r}
base_metrics %>%
kf_add_metric(
name = "coolness-factor",
value = 100,
format = "RAW"
) %>%
kf_add_metric(
name = "badness-factor",
value = 0,
format = "RAW"
)
```
When written to a `_metrics` or `_uimeta` path they will show up in the kubeflow UI!