[docs][R] added R-package docs generation routines #2176

Merged: 20 commits, Sep 1, 2019. Changes shown from 8 commits.
11 changes: 11 additions & 0 deletions .readthedocs.yml
@@ -0,0 +1,11 @@
version: 2
formats:
  - pdf
python:
  version: 3
  install:
    - requirements: docs/requirements.txt
sphinx:
  builder: html
  configuration: docs/conf.py
  fail_on_warning: true
2 changes: 2 additions & 0 deletions R-package/.Rbuildignore
@@ -1,5 +1,7 @@
^build_package.R$
\.gitkeep$
^docs$
^_pkgdown\.yml$

# Objects created by compilation
\.o$
1 change: 1 addition & 0 deletions R-package/DESCRIPTION
@@ -18,6 +18,7 @@ Description: Tree based algorithms can be improved by introducing boosting frame
5. Capable of handling large-scale data.
In recognition of these advantages, LightGBM has been widely used in many winning solutions of machine learning competitions.
Comparison experiments on public datasets suggest that LightGBM can outperform existing boosting frameworks on both efficiency and accuracy, with significantly lower memory consumption. In addition, parallel experiments suggest that in certain circumstances, LightGBM can achieve a linear speed-up in training time by using multiple machines.
Encoding: UTF-8
Collaborator Author commented:

Fix for the warning:

Warning message:
roxygen2 requires Encoding: UTF-8

Collaborator replied:

haha, a legendary roxygen2 error 😀
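
(For reference, the warning comes from regenerating the Rd files with roxygen2; a minimal sketch of that step, assuming roxygen2 is installed and the command is run from the repo root:)

```r
# regenerate the R-package documentation; without `Encoding: UTF-8`
# in DESCRIPTION, roxygen2 emits the warning quoted above
roxygen2::roxygenise("R-package")
```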

License: MIT + file LICENSE
URL: https://github.com/Microsoft/LightGBM
BugReports: https://github.com/Microsoft/LightGBM/issues
21 changes: 10 additions & 11 deletions R-package/R/lgb.Booster.R
@@ -644,11 +644,11 @@ Booster <- R6::R6Class(
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' preds <- predict(model, test$data)
#'
#' @rdname predict.lgb.Booster
@@ -701,11 +701,11 @@ predict.lgb.Booster <- function(object,
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' lgb.save(model, "model.txt")
#' load_booster <- lgb.load(filename = "model.txt")
#' model_string <- model$save_model_to_string(NULL) # saves best iteration
@@ -759,11 +759,11 @@ lgb.load <- function(filename = NULL, model_str = NULL){
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' lgb.save(model, "model.txt")
#'
#' @rdname lgb.save
@@ -806,11 +806,11 @@ lgb.save <- function(booster, filename, num_iteration = NULL){
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' json_model <- lgb.dump(model)
#'
#' @rdname lgb.dump
@@ -850,13 +850,12 @@ lgb.dump <- function(booster, num_iteration = NULL){
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' lgb.get.eval.result(model, "test", "l2")
#'
#' @rdname lgb.get.eval.result
#' @export
lgb.get.eval.result <- function(booster, data_name, eval_name, iters = NULL, is_err = FALSE) {
2 changes: 0 additions & 2 deletions R-package/R/lgb.Dataset.R
@@ -1,4 +1,3 @@

#' @importFrom methods is
#' @importFrom R6 R6Class
Dataset <- R6::R6Class(
@@ -1057,7 +1056,6 @@ lgb.Dataset.set.reference <- function(dataset, reference) {
#' @return passed dataset
#'
#' @examples
#'
#' library(lightgbm)
#' data(agaricus.train, package = "lightgbm")
#' train <- agaricus.train
1 change: 0 additions & 1 deletion R-package/R/lgb.Predictor.R
@@ -1,4 +1,3 @@

#' @importFrom methods is
#' @importFrom R6 R6Class
Predictor <- R6::R6Class(
4 changes: 2 additions & 2 deletions R-package/R/lgb.cv.R
@@ -64,10 +64,10 @@ CVBooster <- R6::R6Class(
#' model <- lgb.cv(params,
#' dtrain,
#' 10,
#' nfold = 5,
#' nfold = 3,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' @export
lgb.cv <- function(params = list(),
data,
7 changes: 3 additions & 4 deletions R-package/R/lgb.importance.R
@@ -22,10 +22,9 @@
#' dtrain <- lgb.Dataset(train$data, label = train$label)
#'
#' params <- list(objective = "binary",
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 20)
#' model <- lgb.train(params, dtrain, 20)
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 10)
#'
#' tree_imp1 <- lgb.importance(model, percentage = TRUE)
#' tree_imp2 <- lgb.importance(model, percentage = FALSE)
2 changes: 1 addition & 1 deletion R-package/R/lgb.interprete.R
@@ -34,7 +34,7 @@
#' , min_data_in_leaf = 1
#' , min_sum_hessian_in_leaf = 1
#' )
#' model <- lgb.train(params, dtrain, 20)
#' model <- lgb.train(params, dtrain, 10)
#'
#' tree_interpretation <- lgb.interprete(model, test$data, 1:5)
#'
7 changes: 3 additions & 4 deletions R-package/R/lgb.model.dt.tree.R
@@ -36,10 +36,9 @@
#' dtrain <- lgb.Dataset(train$data, label = train$label)
#'
#' params <- list(objective = "binary",
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 20)
#' model <- lgb.train(params, dtrain, 20)
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 10)
#'
#' tree_dt <- lgb.model.dt.tree(model)
#'
2 changes: 1 addition & 1 deletion R-package/R/lgb.plot.importance.R
@@ -30,7 +30,7 @@
#' , min_sum_hessian_in_leaf = 1
#' )
#'
#' model <- lgb.train(params, dtrain, 20)
#' model <- lgb.train(params, dtrain, 10)
#'
#' tree_imp <- lgb.importance(model, percentage = TRUE)
#' lgb.plot.importance(tree_imp, top_n = 10, measure = "Gain")
7 changes: 3 additions & 4 deletions R-package/R/lgb.plot.interpretation.R
@@ -27,10 +27,9 @@
#' test <- agaricus.test
#'
#' params <- list(objective = "binary",
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 20)
#' model <- lgb.train(params, dtrain, 20)
#' learning_rate = 0.01, num_leaves = 63, max_depth = -1,
#' min_data_in_leaf = 1, min_sum_hessian_in_leaf = 1)
#' model <- lgb.train(params, dtrain, 10)
#'
#' tree_interpretation <- lgb.interprete(model, test$data, 1:5)
#' lgb.plot.interpretation(tree_interpretation[[1]], top_n = 10)
2 changes: 2 additions & 0 deletions R-package/R/lgb.prepare.R
@@ -26,6 +26,7 @@
#' # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#' # $ Species : num 1 1 1 1 1 1 1 1 1 1 ...
#'
#' \dontrun{
#' # When lightgbm package is installed, and you do not want to load it
#' # You can still use the function!
#' lgb.unloader()
@@ -36,6 +37,7 @@
#' # $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#' # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#' # $ Species : num 1 1 1 1 1 1 1 1 1 1 ...
#' }
#'
#' @export
lgb.prepare <- function(data) {
2 changes: 2 additions & 0 deletions R-package/R/lgb.prepare2.R
@@ -27,6 +27,7 @@
#' # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#' # $ Species : int 1 1 1 1 1 1 1 1 1 1 ...
#'
#' \dontrun{
#' # When lightgbm package is installed, and you do not want to load it
#' # You can still use the function!
#' lgb.unloader()
@@ -37,6 +38,7 @@
#' # $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#' # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#' # $ Species : int 1 1 1 1 1 1 1 1 1 1 ...
#' }
Collaborator commented:

@StrikerRUS why was this necessary? This will generate a block in the rendered PDF that says "## don't run" (user-facing)

I think \donttest{} would be more appropriate, depending on what caused you to add this.

Collaborator Author replied:

@jameslamb I added this because all examples crash after calling lgb.unloader(). Please take a look at our previous conversation with @Laurae2: 3 comments starting from #2176 (comment).
Please help choose an appropriate guard that will prevent these pieces of code from running while the documentation (examples part) is generated.
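
(For reference, a sketch of the two roxygen guards under discussion; both wrap example code, but they differ in rendering and in when the code runs:)

```r
#' @examples
#' \donttest{
#' # skipped during automated example checks by default,
#' # but rendered without a "Not run" marker
#' lgb.unloader()
#' }
#' \dontrun{
#' # never executed automatically; rendered with a "## Not run:" marker
#' lgb.unloader()
#' }
```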

#'
#' @export
lgb.prepare2 <- function(data) {
5 changes: 2 additions & 3 deletions R-package/R/lgb.train.R
@@ -39,12 +39,11 @@
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
Collaborator commented:

Why did you reduce all of these parameters? Was it just to cut the build time on examples?

Collaborator Author replied:

@jameslamb Exactly! BTW, ideally the params for the examples should be tuned, because currently the training is awful, with a huge number of

[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
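
(A purely illustrative sketch, not part of this PR, of params that might produce fewer such warnings on the tiny agaricus example data; `dtrain` is the dataset from the surrounding examples, and the values are untuned guesses:)

```r
params <- list(
    objective = "binary"
    , num_leaves = 4L      # small trees suit the tiny demo data
    , learning_rate = 0.1  # moderate rate instead of 1
    , min_data = 1L
)
model <- lgb.train(params, dtrain, 10L)
```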

#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#'
#' early_stopping_rounds = 5)
#' @export
lgb.train <- function(params = list(),
data,
7 changes: 5 additions & 2 deletions R-package/R/lgb.unloader.R
@@ -20,17 +20,20 @@
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#'
#' \dontrun{
#' lgb.unloader(restore = FALSE, wipe = FALSE, envir = .GlobalEnv)
#' rm(model, dtrain, dtest) # Not needed if wipe = TRUE
#' gc() # Not needed if wipe = TRUE
#'
#' library(lightgbm)
#' # Do whatever you want again with LightGBM without object clashing
#' }
#'
#' @export
lgb.unloader <- function(restore = TRUE, wipe = FALSE, envir = .GlobalEnv) {
1 change: 0 additions & 1 deletion R-package/R/lightgbm.R
@@ -1,4 +1,3 @@

#' @name lgb_shared_params
#' @title Shared parameter docs
#' @description Parameter docs shared by \code{lgb.train}, \code{lgb.cv}, and \code{lightgbm}
4 changes: 2 additions & 2 deletions R-package/R/readRDS.lgb.Booster.R
@@ -19,11 +19,11 @@
#' valids <- list(test = dtest)
#' model <- lgb.train(params,
#' dtrain,
#' 100,
#' 10,
#' valids,
#' min_data = 1,
#' learning_rate = 1,
#' early_stopping_rounds = 10)
#' early_stopping_rounds = 5)
#' saveRDS.lgb.Booster(model, "model.rds")
#' new_model <- readRDS.lgb.Booster("model.rds")
#'
4 changes: 2 additions & 2 deletions R-package/R/saveRDS.lgb.Booster.R
@@ -25,11 +25,11 @@
#' model <- lgb.train(
#' params
#' , dtrain
#' , 100
#' , 10
#' , valids
#' , min_data = 1
#' , learning_rate = 1
#' , early_stopping_rounds = 10
#' , early_stopping_rounds = 5
#' )
#' saveRDS.lgb.Booster(model, "model.rds")
#' @export
18 changes: 9 additions & 9 deletions R-package/README.md
@@ -116,12 +116,12 @@ You may also read [Microsoft/LightGBM#912](https://github.com/microsoft/LightGBM
Examples
--------

Please visit [demo](demo):

* [Basic walkthrough of wrappers](demo/basic_walkthrough.R)
* [Boosting from existing prediction](demo/boost_from_prediction.R)
* [Early Stopping](demo/early_stopping.R)
* [Cross Validation](demo/cross_validation.R)
* [Multiclass Training/Prediction](demo/multiclass.R)
* [Leaf (in)Stability](demo/leaf_stability.R)
* [Weight-Parameter Adjustment Relationship](demo/weight_param.R)
Please visit [demo](https://github.com/microsoft/LightGBM/tree/master/R-package/demo):

* [Basic walkthrough of wrappers](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/basic_walkthrough.R)
* [Boosting from existing prediction](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/boost_from_prediction.R)
* [Early Stopping](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/early_stopping.R)
* [Cross Validation](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/cross_validation.R)
* [Multiclass Training/Prediction](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/multiclass.R)
* [Leaf (in)Stability](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/leaf_stability.R)
* [Weight-Parameter Adjustment Relationship](https://github.com/microsoft/LightGBM/blob/master/R-package/demo/weight_param.R)
Collaborator commented:

why are all of these being pointed to the repo? demo is a special thing in R packages, not something added by convention.

From Writing R Extensions:

The demo subdirectory is for R scripts (for running via demo()) that demonstrate some of the functionality of the package. Demos may be interactive and are not checked automatically, so if testing is desired use code in the tests directory to achieve this. The script files must start with a (lower or upper case) letter and have one of the extensions .R or .r. If present, the demo subdirectory should also have a 00Index file with one line for each demo, giving its name and a description separated by a tab or at least three spaces. (This index file is not generated automatically.) Note that a demo does not have a specified encoding and so should be an ASCII file (see Encoding issues). Function demo() will use the package encoding if there is one, but this is mainly useful for non-ASCII comments.

I recommend reverting this.
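
(For context, demos in an installed package are run through `demo()`; a quick sketch, assuming lightgbm is installed with its demos:)

```r
demo(package = "lightgbm")                       # list the available demos
demo("basic_walkthrough", package = "lightgbm")  # run one of them
```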

Collaborator Author (@StrikerRUS) replied on May 19, 2019:

@jameslamb Thanks for catching this. Indeed, this should be reverted, but only after we start generating the demos on the RTD site. At present, because the demo part is not built by pkgdown, and pkgdown converts the README to index.html (this can be changed by creating an INDEX.md file in the repo, BTW), these links would lead to non-existent pages on RTD. So for now we must choose: relative links that work in the repo but are broken on RTD, or absolute links that work on GitHub.

# # to-do 
# build_articles(preview = FALSE) 
# build_tutorials(preview = FALSE)

https://lightgbm.readthedocs.io/en/docs/R/index.html#examples
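
(A sketch of how the pkgdown build presumably looks with those steps enabled; the function names are from pkgdown's public API, but the PR's actual build script may differ:)

```r
library(pkgdown)
build_home(preview = FALSE)       # renders README.md as index.html
build_reference(preview = FALSE)  # renders the Rd documentation pages
# to-do: demos/articles are not built yet, hence the absolute GitHub links
# build_articles(preview = FALSE)
# build_tutorials(preview = FALSE)
```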

