Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] adding routine registration in R package (fixes #1910) #2911

Merged
merged 28 commits into from
Apr 1, 2020

Conversation

jameslamb
Copy link
Collaborator

@jameslamb jameslamb commented Mar 14, 2020

NOTE: This PR is a work in progress. I branched off of #2901 . The diff for this PR will get smaller once that PR is merged.

Now that #2901 has been merged, this PR is ready for review!

This fixes this R CMD CHECK note:

File ‘lightgbm/libs/lib_lightgbm.so’:
  Found no calls to: ‘R_registerRoutines’, ‘R_useDynamicSymbols’

It is good practice to register native routines and to disable symbol
search.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.

Changes in this PR:

  • Changed CMakeLists.txt to not even compile lightgbm_R.cpp in builds not meant for the R package. Non-R users will get slightly faster builds / smaller libraries, and we don't have to annotate different parts of lightgbm_R.cpp with #ifdef LGB_R_BUILD
  • Added calls to R_registerRoutines and R_useDynamicSymbols

I was very pleasantly surprised to learn that we could address this without changing LGBM_SE to SEXP everywhere 😀 . Maybe we'll decide to do that in the future to deal with whatever changes R 4.0.0 introduces.

All the entries in R_CallMethodDef CallEntries were created with this Python script. I don't think it should be checked into the repo, as future updates to this should be small, such as those we'd need to do for #2850 .

get-signatures.py (click me)
IN_FILE = 'include/LightGBM/lightgbm_R.h'
with open(IN_FILE, 'r') as f:
    txt = [
        line.strip() for line in f
    ]

# filter lines
txt = [
    line for line in txt
    if (
        not line.startswith('/')
        and not line.startswith('*')
        and not line.startswith('#')
    )
]

prefix = 'LIGHTGBM_C_EXPORT LGBM_SE '
functions = {}
current_function = None
for line in txt:
    if line.startswith(prefix):
        current_function = line.replace(prefix, '').replace('(', '')
        functions[current_function] = 0
        print(current_function)
    elif line.startswith('LGBM_SE'):
        functions[current_function] += 1
    elif line.startswith(');'):
        current_function = None
    else:
        raise RuntimeError(f"Encountered unknown line: '{line}'")

# create output like
# {"LGBM_BoosterSaveModelToString_R", (DL_FUNC) &LGBM_BoosterSaveModelToString_R, 6},
# {"LGBM_BoosterDumpModel_R", (DL_FUNC) &LGBM_BoosterDumpModel_R, 6}
longest_func_length = max([
    len(function_name)
    for function_name in functions.keys()
])
print(f"The longest function is {longest_func_length} characters")
for func_name, num_args in functions.items():
    out = '{"'
    out += func_name
    out += '"'
    out += " " * (longest_func_length - len(func_name))
    out += ', (DL_FUNC) &'
    out += func_name
    out += " " * (longest_func_length - len(func_name))
    out += ', '
    out += str(num_args)
    out += '},'
    print(out)

@jameslamb
Copy link
Collaborator Author

noticing a weird there where whenever I don't have a build_script block, AppVeyor does this:

image

@StrikerRUS
Copy link
Collaborator

@jameslamb

noticing a weird there where whenever I don't have a build_script block, AppVeyor does this:

To disable this you should have build: false at the top of your appveyor.yml file like here
https://github.com/RGF-team/rgf/blob/ba62e1f3bbba0b1c3e7b64b6e1aef5c5892e3b42/.appveyor.yml#L1

This causes AppVeyor to look for a Visual Studio project or solution file in the root directory of your project, and use that to do the build. The behavior of an MSBuild mode build can be controlled by the build: entry in appveyor.yml.
https://www.appveyor.com/docs/build-configuration/#build-mode

@jameslamb
Copy link
Collaborator Author

@jameslamb

noticing a weird there where whenever I don't have a build_script block, AppVeyor does this:

To disable this you should have build: false at the top of your appveyor.yml file like here
https://github.com/RGF-team/rgf/blob/ba62e1f3bbba0b1c3e7b64b6e1aef5c5892e3b42/.appveyor.yml#L1

This causes AppVeyor to look for a Visual Studio project or solution file in the root directory of your project, and use that to do the build. The behavior of an MSBuild mode build can be controlled by the build: entry in appveyor.yml.
https://www.appveyor.com/docs/build-configuration/#build-mode

hey thank you! And sorry, I didn't mean to post this on this PR and notify you. Meant to leave it as a note for myself on another PR I'm working on in my fork. Too many tabs open 😬

@jameslamb jameslamb force-pushed the feat/symbol-registration branch from 1b41a35 to 86f271d Compare March 24, 2020 01:03
@jameslamb jameslamb changed the title [WIP] [R-package] started adding routine registration in R package (fixes #1910) [R-package] started adding routine registration in R package (fixes #1910) Mar 24, 2020
@jameslamb jameslamb marked this pull request as ready for review March 24, 2020 01:28
@jameslamb
Copy link
Collaborator Author

This PR has been rebased to master

@jameslamb
Copy link
Collaborator Author

This PR has been rebased to master

With these changes plus the other changes merged in over the last few days, Rscript build_r.R produces a package with 0 ERRORS, 0 WARNINGS, and 0 NOTES* on macOS (with R 3.6.1) and two different versions of Ubuntu!

(0 / 0 / 0) my mac (`gcc`, macOS 10.14)
* using log directory ‘/Users/jlamb/repos/LightGBM/lightgbm.Rcheck’
* using R version 3.6.1 (2019-07-05)
* using platform: x86_64-apple-darwin15.6.0 (64-bit)
* using session charset: UTF-8
* using option ‘--as-cran’
* checking for file ‘lightgbm/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘lightgbm’ version ‘2.3.2’
* package encoding: UTF-8
* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Guolin Ke <[email protected]>’

New submission

License components with restrictions and base license permitting such:
  MIT + file LICENSE
File 'LICENSE':
  The MIT License (MIT)

  Copyright (c) Microsoft Corporation

  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
  in the Software without restriction, including without limitation the rights
  to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  copies of the Software, and to permit persons to whom the Software is
  furnished to do so, subject to the following conditions:

  The above copyright notice and this permission notice shall be included in all
  copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
  AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  SOFTWARE.

The Date field is over a month old.
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking serialization versions ... OK
* checking whether package ‘lightgbm’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... OK
* checking compilation flags in Makevars ... OK
* checking for GNU extensions in Makefiles ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking use of SHLIB_OPENMP_*FLAGS in Makefiles ... OK
* checking pragmas in C/C++ headers and code ... OK
* checking compilation flags used ... OK
* checking compiled code ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’ [16s/14s]
 OK
* checking PDF version of manual ... OK
* DONE

Status: 1 NOTE

(0/0/0) Azure DevOps linux (gcc, Ubuntu 14.04) (link)
(0/0/0) Azure DevOps macOS (clang, macOS 10.14) (link)
(0/0/0) Travis linux (clang, Ubuntu 18.04) (link)
(0/0/0) Travis macOS (gcc, macOS 10.14) (link)

Some of these logs have 1 or two NOTEs, but they're not actually issues that will show up on CRAN. If you test a package that doesn't exist on CRAN today, you get a "hey this is a new package or new maintainer" note. The other one I see is about some Suggests dependencies not being available at testing time. Those won't be an issue on CRAN either because CRAN will install all your Suggests dependencies if they come from CRAN (which ours do)

@jameslamb jameslamb changed the title [R-package] started adding routine registration in R package (fixes #1910) [R-package] adding routine registration in R package (fixes #1910) Mar 24, 2020
@StrikerRUS
Copy link
Collaborator

This is awesome! 🎉

Also, I remember there are some NOTEs about the latest release date:

Maybe we should start counting NOTEs at CI and allow only 2?

@jameslamb
Copy link
Collaborator Author

Maybe we should start counting NOTEs at CI and allow only 2?

Good point about the other NOTEs, we may encounter those at some point. I agree, we should start to only allow 2 so we don't have things like #2928 pop back up.

From what I've seen so far on #2936 we have two other NOTEs that are Windows-specific that I'll fix in a followup PR (#2936 (comment)). I'm going to add these and some other comments on the current state to #629 since we're tracking progress there.

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we exclude lightgbm_R.cpp from VS solution?

CMakeLists.txt Outdated Show resolved Hide resolved
@jameslamb
Copy link
Collaborator Author

Should we exclude lightgbm_R.cpp from VS solution?

@StrikerRUS I think we should. I've been experimenting with MSBuild on my fork, trying to add something like the approach in the Python package but struggling with it. I think we should not try to build the R package with MSBuild (just sticking to MinGW or MSVC bundled with Visual Studio).

I'll remove it from those files.

@guolinke
Copy link
Collaborator

@jameslamb
I guess the MSBuild you refer to is the LightGBM.sln (VS solution) ?

@jameslamb
Copy link
Collaborator Author

@jameslamb
I guess the MSBuild you refer to is the LightGBM.sln (VS solution) ?

Yeah that's right. Like this approach in the Python package: https://github.com/microsoft/LightGBM/blob/master/python-package/setup.py#L135-L148

Sorry, not that familiar with MSBuild and Visual Studio so I might not be referring to those precisely.

CMakeLists.txt Outdated Show resolved Hide resolved
@jameslamb jameslamb force-pushed the feat/symbol-registration branch 2 times, most recently from f1d7604 to daa9da5 Compare March 25, 2020 22:39
@jameslamb
Copy link
Collaborator Author

Maybe we should start counting NOTEs at CI and allow only 2?

Good point about the other NOTEs, we may encounter those at some point. I agree, we should start to only allow 2 so we don't have things like #2928 pop back up.

From what I've seen so far on #2936 we have two other NOTEs that are Windows-specific that I'll fix in a followup PR (#2936 (comment)). I'm going to add these and some other comments on the current state to #629 since we're tracking progress there.

just rebased to master to pick up the changes from #2942 and changed the allowed number of notes from 3 to 2

@StrikerRUS
Copy link
Collaborator

@jameslamb

And agree with the suggestion to only link into _lightgbm.

Seems that corresponding commit was lost.

@jameslamb
Copy link
Collaborator Author

@jameslamb

And agree with the suggestion to only link into _lightgbm.

Seems that corresponding commit was lost.

ah! will fix it

@jameslamb jameslamb force-pushed the feat/symbol-registration branch from daa9da5 to 820ea9b Compare March 28, 2020 03:14
@jameslamb jameslamb force-pushed the feat/symbol-registration branch from 820ea9b to 1666f67 Compare April 1, 2020 15:55
@jameslamb
Copy link
Collaborator Author

Thanks for the reviews!

Going to merge since CI checks are passing and I think I've addressed all remaining comments. Now that this is merged, everyone building LightGBM in settings other than the R package is going to get faster builds and a smaller lib_lightgbm 😀

@jameslamb jameslamb merged commit 84bdf25 into microsoft:master Apr 1, 2020
@jameslamb jameslamb mentioned this pull request Apr 4, 2020
@jameslamb jameslamb deleted the feat/symbol-registration branch April 29, 2020 04:08
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants