Commit

Merge pull request #64 from databrickslabs/cran-fixes-v0.2.4
Adjustments in preparation for CRAN resubmission
RafiKurlansik authored Aug 29, 2024
2 parents dfe4098 + 45a39c9 commit 46cd34a
Showing 15 changed files with 35 additions and 176 deletions.
13 changes: 7 additions & 6 deletions DESCRIPTION
@@ -1,6 +1,6 @@
Package: brickster
Title: R Toolkit for Databricks
Version: 0.2.4
Title: R Toolkit for 'Databricks'
Version: 0.2.5
Authors@R:
c(
person(given = "Zac",
@@ -13,9 +13,10 @@ Authors@R:
email = "[email protected]"),
person("Databricks", role = c("cph", "fnd"))
)
Description: Collection of utilities that improve using Databricks from R.
Primarily functions that wrap specific Databricks APIs, RStudio connection
pane support, quality of life functions to make Databricks simpler to use.
Description: Collection of utilities that improve using 'Databricks' from R.
Primarily functions that wrap specific 'Databricks' APIs
(<https://docs.databricks.com/api>), 'RStudio' connection pane support, quality
of life functions to make 'Databricks' simpler to use.
License: Apache License (>= 2)
Encoding: UTF-8
LazyData: true
@@ -49,5 +50,5 @@ Suggests:
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.1
VignetteBuilder: knitr
URL: https://github.com/zacdav-db/brickster
URL: https://github.com/databrickslabs/brickster
Config/testthat/edition: 3
2 changes: 0 additions & 2 deletions NAMESPACE
@@ -200,9 +200,7 @@ export(lib_pypi)
export(lib_whl)
export(libraries)
export(new_cluster)
export(notebook_enable_htmlwidgets)
export(notebook_task)
export(notebook_use_posit_repo)
export(open_workspace)
export(pipeline_task)
export(py_db_sql_connector)
2 changes: 1 addition & 1 deletion R/data-structures.R
@@ -840,7 +840,7 @@ is.email_notifications <- function(x) {
#'
#' @param quartz_cron_expression Cron expression using Quartz syntax that
#' describes the schedule for a job.
#' See [Cron Trigger](http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html)
#' See [Cron Trigger](https://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html)
#' for details.
#' @param timezone_id Java timezone ID. The schedule for a job is resolved with
#' respect to this timezone.
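
For readers unfamiliar with Quartz syntax, the following is a minimal sketch of how a Quartz cron expression breaks down into fields. It is illustrative only and not part of this commit or of `{brickster}`: `parse_quartz_cron()` and `quartz_fields` are hypothetical names introduced here. The example expression `0 0 12 * * ?` is the "fire at noon every day" case from the Quartz Cron Trigger tutorial linked above.

```r
# Hypothetical helper (not part of brickster): name the fields of a
# Quartz cron expression. Quartz uses 6 fields plus an optional year.
quartz_fields <- c("seconds", "minutes", "hours", "day_of_month",
                   "month", "day_of_week", "year")

parse_quartz_cron <- function(expr) {
  # split on any run of whitespace
  parts <- strsplit(trimws(expr), "[[:space:]]+")[[1]]
  if (!(length(parts) %in% c(6L, 7L))) {
    stop("a Quartz cron expression has 6 or 7 fields, got ", length(parts))
  }
  stats::setNames(parts, quartz_fields[seq_along(parts)])
}

# "0 0 12 * * ?" fires at 12:00 every day
daily_noon <- parse_quartz_cron("0 0 12 * * ?")
daily_noon[["hours"]]
```

Note that this only splits and labels the fields; it does not validate the values inside each field.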
13 changes: 0 additions & 13 deletions R/databricks-helpers.R
@@ -3,19 +3,6 @@ on_databricks <- function() {
dbr != ""
}

in_databricks_nb <- function() {
("/databricks/spark/R/lib" %in% .libPaths()) &&
exists("DATABRICKS_GUID", envir = .GlobalEnv)
}

use_posit_repo <- function() {
if (in_databricks_nb()) {
codename <- system("lsb_release -c --short", intern = T)
mirror <- paste0("https://packagemanager.posit.co/cran/__linux__/", codename, "/latest")
options(repos = c(POSIT = mirror))
}
}

#' Determine brickster virtualenv
#'
#' @details Returns `NULL` when running within Databricks,
15 changes: 11 additions & 4 deletions R/knitr-engines.R
@@ -96,10 +96,8 @@ clean_command_results <- function(x, options, language) {
if (options$eval) {
schema <- data.table::rbindlist(x$results$schema)
tbl <- data.table::rbindlist(x$results$data)

names(tbl) <- schema$name
if (!is.null(options$keep_as)) {
base::assign(options$keep_as, value = tbl, envir = .GlobalEnv)
}
if (isTRUE(getOption('knitr.in.progress'))) {
outputs$table <- knitr::engine_output(
options = options,
@@ -109,6 +107,12 @@ clean_command_results <- function(x, options, language) {
knitr::knit_print(tbl)
}

# when `output.var` option is used return the table assigned to object
varname <- options$output.var
if (!is.null(varname)) {
assign(varname, tbl, envir = knitr::knit_global())
}

}

return(do.call(paste, outputs))
@@ -139,11 +143,14 @@ clean_command_results <- function(x, options, language) {
if (isTRUE(getOption('knitr.in.progress'))) {
outputs$plot <- knitr::engine_output(
options = options,
out = list(knitr::include_graphics(path = file))
out = list(knitr::include_graphics(path = file, dpi = options$dpi))
)
} else {
res <- structure(file, class = c("knit_image_paths", "knit_asis"), dpi = options$dpi)
print(res)
# img <- magick::image_read(raw)
# grid::grid.newpage()
# grid::grid.raster(img)
}
}

79 changes: 0 additions & 79 deletions R/notebook-helpers.R
@@ -10,82 +10,3 @@ in_databricks_nb <- function() {
("/databricks/spark/R/lib" %in% .libPaths()) &&
exists("DATABRICKS_GUID", envir = .GlobalEnv)
}

#' Setup Databricks Notebook with Posit Package Manager
#'
#' @details
#' Databricks notebooks default repo for package installation is CRAN.
#' CRAN doesn't provide pre-compiled binaries for linux and this results in
#' packages taking longer than desired.
#'
#' This function can be called within a Databricks notebook to easily switch to
#' Posit and retrieve pre-compiled binaries.
#'
#' This function will behave correctly across different Databricks Runtimes,
#' even when the underlying linux version changes.
#'
#' @export
notebook_use_posit_repo <- function() {
if (in_databricks_nb()) {
agent <- sprintf("R/%s R (%s)", getRversion(), paste(getRversion(), R.version["platform"], R.version["arch"], R.version["os"]))
codename <- system("lsb_release -c --short", intern = T)
mirror <- paste0("https://packagemanager.posit.co/cran/__linux__/", codename, "/latest")
options(
HTTPUserAgent = agent,
repos = c(POSIT = mirror, getOption("repos"))
)
}
}

#' Enable htmlwidgets in Databricks Notebook
#'
#' @details
#' Databricks notebooks by default don't currently support htmlwidgets.
#' This behaviour can be corrected by:
#' - adjusting the print method in htmltools
#' - installing pandoc
#'
#' This is a invasive method to correct the behaviour as htmltools isn't
#' flexible to adjust via the `viewer` option directly.
#'
#' It only runs within a Databricks notebook cell.
#'
#' The height can be adjusted without running the function again by using the
#' `db_htmlwidget_height` option (e.g. `options(db_htmlwidget_height = 300)`).
#'
#'
#' @param height Measurement passed to height of htmlwidget. This overrides
#' existing values that may often be `NULL` to ensure the height is correctly
#' displayed within the iframe of notebook results cells (via `displayHTML()`).
#' Default is 450.
#'
#' @export
#'
#' @examples
#' notebook_enable_htmlwidgets()
#' # set default height to 800px
#' notebook_enable_htmlwidgets(height = 800)
notebook_enable_htmlwidgets <- function(height = 450) {
if (in_databricks_nb()) {

# new option to control default widget height, default is 450px
options(db_htmlwidget_height = height)

system("apt-get --yes install pandoc", intern = T)
if (!base::require("htmlwidgets")) {
utils::install.packages("htmlwidgets")
}

# new method will fetch height based on new option, or default to 450px
new_method <- function(x, ...) {
x$height <- getOption("db_htmlwidget_height", 450)
file <- tempfile(fileext = ".html")
htmlwidgets::saveWidget(x, file = file)
contents <- as.character(rvest::read_html(file))
displayHTML(contents)
}

utils::assignInNamespace("print.htmlwidget", new_method, ns = "htmlwidgets")
invisible(list(default_height = height, print = new_method))
}
}
4 changes: 2 additions & 2 deletions README.md
@@ -1,8 +1,8 @@
# [brickster](https://databrickslabs.github.io/brickster/) <a href='https://zacdav-db.github.io/brickster/'><img src="man/figures/logo.png" align="right" height="139"/></a>
# [brickster](https://databrickslabs.github.io/brickster/) <a href='https://databrickslabs.github.io/brickster/'><img src="man/figures/logo.png" align="right" height="139"/></a>

<!-- badges: start -->

[![R-CMD-check](https://github.com/zacdav-db/brickster/workflows/R-CMD-check/badge.svg)](https://github.com/zacdav-db/brickster/actions) [![Codecov test coverage](https://codecov.io/gh/zacdav-db/brickster/branch/main/graph/badge.svg)](https://app.codecov.io/gh/zacdav-db/brickster?branch=main)
[![R-CMD-check](https://github.com/databrickslabs/brickster/workflows/R-CMD-check/badge.svg)](https://github.com/databrickslabs/brickster/actions) [![Codecov test coverage](https://codecov.io/gh/zacdav-db/brickster/branch/main/graph/badge.svg)](https://app.codecov.io/gh/zacdav-db/brickster?branch=main)

<!-- badges: end -->

2 changes: 0 additions & 2 deletions _pkgdown.yml
@@ -52,8 +52,6 @@ reference:
- title: Databricks Notebook Helpers
contents:
- in_databricks_nb
- notebook_use_posit_repo
- notebook_enable_htmlwidgets
- title: DBFS
contents: starts_with("db_dbfs", internal = TRUE)
- title: Volume FileSystem
@@ -48,6 +48,12 @@ print("hello from Databricks")
```

```{python, engine = "databricks_py"}
# install folium
!pip install folium
```


```{python, engine = "databricks_py"}
import folium
m = folium.Map(location=[45.5236, -122.6750])
2 changes: 1 addition & 1 deletion man/cron_schedule.Rd

Some generated files are not rendered by default.

38 changes: 0 additions & 38 deletions man/notebook_enable_htmlwidgets.Rd

This file was deleted.

22 changes: 0 additions & 22 deletions man/notebook_use_posit_repo.Rd

This file was deleted.

2 changes: 0 additions & 2 deletions tests/testthat/test-notebook-helpers.R
@@ -2,7 +2,5 @@ test_that("Databricks Notebook Helpers", {

# currently running tests outside of a databricks notebook
expect_false(in_databricks_nb())
expect_no_error(notebook_use_posit_repo())
expect_no_error(notebook_enable_htmlwidgets())

})
4 changes: 2 additions & 2 deletions vignettes/rmarkdown-databricks-notebook.Rmd
@@ -126,10 +126,10 @@ Results that are detected as tabular in **any** `databricks_*` chunk will be ren

### Persisting Tabular Results

When a result is rendered as a table you can persist a copy to the R session (`.GlobalEnv`) by using the `keep_as` chunk option
When a result is rendered as a table you can persist a copy to the R session (`.GlobalEnv`) by using the `output.var` chunk option

```` r
`r ''````{sql, engine = "databricks_sql", keep_as = "tables"}
`r ''````{sql, engine = "databricks_sql", output.var = "tables"}
show databases
```
````
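
The assignment behind `output.var` can be sketched in base R as below. This is a simplified illustration of the logic added to `clean_command_results()` in this commit, not the package code itself: `persist_result()` is a hypothetical name, `options` is a plain list standing in for knitr's chunk options, and a throwaway environment replaces `knitr::knit_global()` so the sketch runs without knitr.

```r
# Hypothetical stand-in for the `output.var` handling in this commit:
# assign the table to the requested name only when the option is set.
persist_result <- function(tbl, options, envir = globalenv()) {
  varname <- options$output.var
  if (!is.null(varname)) {
    assign(varname, tbl, envir = envir)
  }
  invisible(varname)
}

# Stand-ins for a tabular chunk result and for the knit environment
target <- new.env()
tbl <- data.frame(databaseName = c("default", "samples"))

# With `output.var = "tables"` the result is stored under that name
persist_result(tbl, list(output.var = "tables"), envir = target)
exists("tables", envir = target, inherits = FALSE)

# Without `output.var` nothing is assigned
persist_result(tbl, list(eval = TRUE), envir = target)
ls(target)
```

In the actual change the target environment is `knitr::knit_global()`, which is where knitr evaluates chunks, so the stored table is visible to later R chunks in the same document.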
7 changes: 5 additions & 2 deletions vignettes/setup-auth.Rmd
@@ -20,13 +20,16 @@ knitr::opts_chunk$set(
The `{brickster}` package connects to a Databricks workspace in two ways:

1. [OAuth user-to-machine (U2M) authentication](https://docs.databricks.com/en/dev-tools/auth/oauth-u2m.html#oauth-user-to-machine-u2m-authentication)
2. [Personal Access Tokens (PAT)](https://docs.databricks.com/en/dev-tools/auth/pat.htmlhttps://docs.databricks.com/en/dev-tools/auth/pat.html)
2. [Personal Access Tokens (PAT)](https://docs.databricks.com/en/dev-tools/auth/pat.html)

It's recommended to use option (1) when using `{brickster}` interactively; if you need to run code via an automated process, the only option currently is (2).

Personal Access Tokens can be generated in a few steps, for a step-by-step breakdown [refer to the documentation](https://docs.databricks.com/dev-tools/api/latest/authentication.html).

Once you have a token you'll be able to store it alongside the workspace URL in an `.Renviron` file. The `.Renviron` is used for storing the variables, such as those which may be sensitive (e.g. credentials) and de-couple them from the code (additional reading: [1](https://support.rstudio.com/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf), [2](https://cran.r-project.org/web/packages/startup/vignettes/startup-intro.html)).
Once you have a token you'll be able to store it alongside the workspace URL in an `.Renviron` file. The `.Renviron` is used for storing the variables, such as those which may be sensitive (e.g. credentials) and de-couple them from the code (additional reading: [1](https://support.posit.co/hc/en-us/articles/360047157094-Managing-R-with-Rprofile-Renviron-Rprofile-site-Renviron-site-rsession-conf-and-repos-conf), [2](https://CRAN.R-project.org/package=startup/vignettes/startup-intro.html)).




To get started add the following to your `.Renviron`:

