[python-package] remove unnecessary files to reduce sdist size #3639
Similar to #3579, this PR proposes making `python-package/MANIFEST.in` stricter to prevent unnecessary files from being bundled in the source distribution of the Python package.

#3405 introduced two new submodules (`fmt` and `fast_double_parser`), and right now all of their contents are being bundled in the source distribution of `lightgbm`. That includes a lot of files that are unnecessary for LightGBM, like tests and documentation.

This PR removes them. See #3579 for why this is worth caring about.
master
checking the size of the python package

You can run the script below, `./check-sizes.sh`, to calculate the size of the Python package.

check-sizes.sh
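For anyone who wants a quick cross-check without the shell script, here is a minimal Python sketch of the same idea. It is not the `check-sizes.sh` script from this PR: the function name `summarize_sdist` is hypothetical, and it assumes you have already built an sdist tarball yourself. It reports the total uncompressed size, the size per file extension, and the largest files, which is one way to spot things like stray test data.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_sdist(path):
    """Summarize the regular files inside an sdist tarball.

    Returns (total_bytes, bytes_by_suffix, largest_members), where sizes are
    uncompressed and largest_members is a list of up to 10 (size, name) tuples.
    NOTE: hypothetical helper for illustration, not part of LightGBM.
    """
    bytes_by_suffix = Counter()
    members = []
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(path, "r:*") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links, etc.
            suffix = PurePosixPath(member.name).suffix or "(none)"
            bytes_by_suffix[suffix] += member.size
            members.append((member.size, member.name))
    total_bytes = sum(size for size, _ in members)
    largest_members = sorted(members, reverse=True)[:10]
    return total_bytes, bytes_by_suffix, largest_members
```

Calling `summarize_sdist` on the tarball produced by the sdist build and printing `bytes_by_suffix.most_common()` should make an overly broad rule (for example, one matching every `.txt` file) stand out, assuming the offending files are large enough to matter.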
Note that the `check-sizes.sh` script copies the contents of `lightgbm.egg-info/SOURCES.txt` to a file, `~/LIGHTGBM-SOURCES.txt`. Inspect that file to see a full list of everything included in the `sdist` package. This is how I figured out what changes to make in `MANIFEST.in`. For example, it showed that `fast_double_parser`'s test data is in a `.txt` file, so a rule matching `*.txt` was including it.

how LightGBM uses `fmt` and `fast_double_parser`
LightGBM only re-uses header files from these two libraries. Specifically, it only needs these files:
Notes for reviewers
LightGBM/build-cran-package.sh
Lines 31 to 38 in 6320f1d