[ci] run r-package Linux jobs in containers #5638

Merged 39 commits on Jan 10, 2023.

Commits
7cd6512
[ci] run r-package Linux jobs in containers
jameslamb Dec 16, 2022
3d87bd3
comment out CI
jameslamb Dec 16, 2022
d206222
remove extra dollar sign
jameslamb Dec 16, 2022
841577b
install git before checkout
jameslamb Dec 16, 2022
6653141
add certificates
jameslamb Dec 16, 2022
acbef20
install more stuff
jameslamb Dec 16, 2022
d67aa05
actually run tests
jameslamb Dec 16, 2022
ce5f897
use the IN_UBUNTU_LATEST_CONTAINER setup.sh stuff
jameslamb Dec 16, 2022
5a1df8c
fail earlier
jameslamb Dec 16, 2022
217fa7a
try changing apt repo
jameslamb Dec 16, 2022
3a6b7cf
add key-management stuff
jameslamb Dec 16, 2022
866072a
more packaging things
jameslamb Dec 16, 2022
327dfe0
fix locale
jameslamb Dec 16, 2022
9783583
testing
jameslamb Dec 16, 2022
1200bae
ensure we get the expected version of R
jameslamb Dec 16, 2022
951fd2b
ensure locale is set successfully
jameslamb Dec 20, 2022
dbfa42d
run old R on Ubuntu 18.04
jameslamb Dec 20, 2022
20ede2f
re-generate locale
jameslamb Dec 20, 2022
265931a
install newest version of git
jameslamb Dec 20, 2022
c59fe90
install add-apt-repository
jameslamb Dec 20, 2022
cbc4d32
move more stuff up prior to third-party actions
jameslamb Dec 20, 2022
587f070
re-organize
jameslamb Dec 21, 2022
be480a8
try to fix locale stuff
jameslamb Dec 21, 2022
c0d5106
set locale environment variables
jameslamb Dec 21, 2022
f11827d
add automake to get 'aclocal'
jameslamb Dec 21, 2022
5d96abc
test updating to newest cmake version
jameslamb Dec 21, 2022
fe82bac
install newest cmake
jameslamb Dec 21, 2022
d69bd00
skip license
jameslamb Dec 21, 2022
604f413
check R version, re-enable macOS jobs
jameslamb Dec 21, 2022
fa7fa85
*sighs in YAML*
jameslamb Dec 21, 2022
1009411
fix checks
jameslamb Dec 21, 2022
e256f8c
trust source dir
jameslamb Dec 21, 2022
6902822
revert encoding stuff that is now on master
jameslamb Dec 28, 2022
3c4e96e
env var
jameslamb Dec 28, 2022
4ba0cfb
merge master
jameslamb Dec 28, 2022
e6ba86a
restore CI jobs
jameslamb Dec 28, 2022
598a7d6
restore Windows R jobs
jameslamb Dec 28, 2022
5eeafa0
only install newer CMake on Ubuntu 18.04 builds
jameslamb Dec 28, 2022
4f2d268
Merge branch 'master' into ci/r-package-containers
jameslamb Dec 29, 2022
merge master
jameslamb committed Dec 28, 2022
commit 4ba0cfb76da53e7aeec40a4e143d3f576de8bba6
5 changes: 5 additions & 0 deletions .ci/test.sh
@@ -8,6 +8,11 @@ elif [[ $OS_NAME == "linux" ]] && [[ $COMPILER == "clang" ]]; then
export CC=clang
fi

if [[ $IN_UBUNTU_BASE_CONTAINER == "true" ]]; then
export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
fi

if [[ "${TASK}" == "r-package" ]] || [[ "${TASK}" == "r-rchk" ]]; then
bash ${BUILD_DIRECTORY}/.ci/test_r_package.sh || exit -1
exit 0
2 changes: 1 addition & 1 deletion .github/workflows/linkchecker.yml
@@ -17,7 +17,7 @@ env:
jobs:
check-links:
timeout-minutes: 60
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
4 changes: 2 additions & 2 deletions .github/workflows/optional_checks.yml
@@ -7,9 +7,9 @@ on:
- release/*

jobs:
all-successful:
all-optional-checks-successful:
timeout-minutes: 120
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
140 changes: 71 additions & 69 deletions .github/workflows/r_package.yml
@@ -223,76 +223,78 @@ jobs:
$env:GITHUB_ACTIONS = "true"
$env:TASK = "${{ matrix.task }}"
& "$env:GITHUB_WORKSPACE/.ci/test_windows.ps1"
# test-r-sanitizers:
# name: r-sanitizers (ubuntu-latest, R-devel, ${{ matrix.compiler }} ASAN/UBSAN)
# timeout-minutes: 60
# runs-on: ubuntu-latest
# container: wch1/r-debug
# strategy:
# fail-fast: false
# matrix:
# include:
# - r_customization: san
# compiler: gcc
# - r_customization: csan
# compiler: clang
# steps:
# - name: Trust git cloning LightGBM
# run: |
# git config --global --add safe.directory "${GITHUB_WORKSPACE}"
# - name: Checkout repository
# uses: actions/checkout@v3
# with:
# fetch-depth: 5
# submodules: true
# - name: Install packages
# shell: bash
# run: |
# RDscript${{ matrix.r_customization }} -e "install.packages(c('R6', 'data.table', 'jsonlite', 'knitr', 'Matrix', 'RhpcBLASctl', 'rmarkdown', 'testthat'), repos = 'https://cran.rstudio.com', Ncpus = parallel::detectCores())"
# sh build-cran-package.sh --r-executable=RD${{ matrix.r_customization }}
# RD${{ matrix.r_customization }} CMD INSTALL lightgbm_*.tar.gz || exit -1
# - name: Run tests with sanitizers
# shell: bash
# run: |
# cd R-package/tests
# exit_code=0
# RDscript${{ matrix.r_customization }} testthat.R >> tests.log 2>&1 || exit_code=-1
# cat ./tests.log
# exit ${exit_code}
# test-r-debian-clang:
# name: r-package (debian, R-devel, clang)
# timeout-minutes: 60
# runs-on: ubuntu-latest
# container: rhub/debian-clang-devel
# steps:
# - name: Install Git before checkout
# shell: bash
# run: |
# apt-get update --allow-releaseinfo-change
# apt-get install --no-install-recommends -y git
# - name: Trust git cloning LightGBM
# run: |
# git config --global --add safe.directory "${GITHUB_WORKSPACE}"
# - name: Checkout repository
# uses: actions/checkout@v3
# with:
# fetch-depth: 5
# submodules: true
# - name: Install packages and run tests
# shell: bash
# run: |
# export PATH=/opt/R-devel/bin/:${PATH}
# Rscript -e "install.packages(c('R6', 'data.table', 'jsonlite', 'knitr', 'Matrix', 'RhpcBLASctl', 'rmarkdown', 'testthat'), repos = 'https://cran.rstudio.com', Ncpus = parallel::detectCores())"
# sh build-cran-package.sh
# R CMD check --as-cran --run-donttest lightgbm_*.tar.gz || exit -1
# if grep -q -E "NOTE|WARNING|ERROR" lightgbm.Rcheck/00check.log; then
# echo "NOTEs, WARNINGs, or ERRORs have been found by R CMD check"
# exit -1
# fi
all-successful:
# https://github.community/t/is-it-possible-to-require-all-github-actions-tasks-to-pass-without-enumerating-them/117957/4?u=graingert
# test-r-sanitizers:
# name: r-sanitizers (ubuntu-latest, R-devel, ${{ matrix.compiler }} ASAN/UBSAN)
# timeout-minutes: 60
# runs-on: ubuntu-latest
# container: wch1/r-debug
# strategy:
# fail-fast: false
# matrix:
# include:
# - r_customization: san
# compiler: gcc
# - r_customization: csan
# compiler: clang
# steps:
# - name: Trust git cloning LightGBM
# run: |
# git config --global --add safe.directory "${GITHUB_WORKSPACE}"
# - name: Checkout repository
# uses: actions/checkout@v3
# with:
# fetch-depth: 5
# submodules: true
# - name: Install packages
# shell: bash
# run: |
# RDscript${{ matrix.r_customization }} -e "install.packages(c('R6', 'data.table', 'jsonlite', 'knitr', 'Matrix', 'RhpcBLASctl', 'rmarkdown', 'testthat'), repos = 'https://cran.rstudio.com', Ncpus = parallel::detectCores())"
# sh build-cran-package.sh --r-executable=RD${{ matrix.r_customization }}
# RD${{ matrix.r_customization }} CMD INSTALL lightgbm_*.tar.gz || exit -1
# - name: Run tests with sanitizers
# shell: bash
# run: |
# cd R-package/tests
# exit_code=0
# RDscript${{ matrix.r_customization }} testthat.R >> tests.log 2>&1 || exit_code=-1
# cat ./tests.log
# exit ${exit_code}
# test-r-debian-clang:
# name: r-package (debian, R-devel, clang)
# timeout-minutes: 60
# runs-on: ubuntu-latest
# container: rhub/debian-clang-devel
# steps:
# - name: Install Git before checkout
# shell: bash
# run: |
# apt-get update --allow-releaseinfo-change
# apt-get install --no-install-recommends -y git
# - name: Trust git cloning LightGBM
# run: |
# git config --global --add safe.directory "${GITHUB_WORKSPACE}"
# - name: Checkout repository
# uses: actions/checkout@v3
# with:
# fetch-depth: 5
# submodules: true
# - name: Install packages and run tests
# shell: bash
# run: |
# export PATH=/opt/R-devel/bin/:${PATH}
# Rscript -e "install.packages(c('R6', 'data.table', 'jsonlite', 'knitr', 'Matrix', 'RhpcBLASctl', 'rmarkdown', 'testthat'), repos = 'https://cran.rstudio.com', Ncpus = parallel::detectCores())"
# sh build-cran-package.sh
# R CMD check --as-cran --run-donttest lightgbm_*.tar.gz || exit -1
# if grep -q -E "NOTE|WARNING|ERROR" lightgbm.Rcheck/00check.log; then
# echo "NOTEs, WARNINGs, or ERRORs have been found by R CMD check"
# exit -1
# fi
all-r-package-jobs-successful:
if: always()
runs-on: ubuntu-latest
needs: [test]
steps:
- name: Note that all tests succeeded
run: echo "🎉"
uses: re-actors/alls-green@v1.2.2
with:
jobs: ${{ toJSON(needs) }}
12 changes: 7 additions & 5 deletions .github/workflows/static_analysis.yml
@@ -47,7 +47,7 @@ jobs:
r-check-docs:
name: r-package-check-docs
timeout-minutes: 60
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
container: rocker/verse
steps:
- name: Trust git cloning LightGBM
@@ -80,10 +80,12 @@ jobs:
echo ""
exit -1
fi
all-successful:
# https://github.community/t/is-it-possible-to-require-all-github-actions-tasks-to-pass-without-enumerating-them/117957/4?u=graingert
runs-on: ubuntu-22.04
all-static-analysis-jobs-successful:
if: always()
runs-on: ubuntu-latest
needs: [test, r-check-docs]
steps:
- name: Note that all tests succeeded
run: echo "🎉"
uses: re-actors/alls-green@v1.2.2
with:
jobs: ${{ toJSON(needs) }}
22 changes: 22 additions & 0 deletions R-package/cran-comments.md
@@ -1,9 +1,31 @@
# CRAN Submission History

## v3.3.4 - Submission 1 - (December 15, 2022)

### CRAN response

Accepted to CRAN

### Maintainer Notes

Submitted with the following comment:

> This submission contains {lightgbm} 3.3.3.

> Per CRAN's policies, I am submitting it on behalf of the project's maintainer (Yu Shi), with his permission.

> This submission includes patches to address the following warnings observed on the fedora and debian CRAN checks.
>
> Compiled code should not call entry points which might terminate R nor write to stdout/stderr instead of to the console, nor use Fortran I/O nor system RNGs nor [v]sprintf.

> Thank you very much for your time and consideration.

## v3.3.3 - Submission 1 - (October 10, 2022)

### CRAN response

Accepted to CRAN

### Maintainer Notes

Submitted with the following comment:
1 change: 1 addition & 0 deletions R-package/src/Makevars.in
@@ -26,6 +26,7 @@ OBJECTS = \
boosting/gbdt_model_text.o \
boosting/gbdt_prediction.o \
boosting/prediction_early_stop.o \
boosting/sample_strategy.o \
io/bin.o \
io/config.o \
io/config_auto.o \
1 change: 1 addition & 0 deletions R-package/src/Makevars.win.in
@@ -27,6 +27,7 @@ OBJECTS = \
boosting/gbdt_model_text.o \
boosting/gbdt_prediction.o \
boosting/prediction_early_stop.o \
boosting/sample_strategy.o \
io/bin.o \
io/config.o \
io/config_auto.o \
2 changes: 2 additions & 0 deletions README.md
@@ -121,6 +121,8 @@ MLflow (experiment tracking, model monitoring framework): https://github.com/mlf

lightgbm-transform (feature transformation binding): https://github.com/microsoft/lightgbm-transform

`postgresml` (LightGBM training and prediction in SQL, via a Postgres extension): https://github.com/postgresml/postgresml

Support
-------

2 changes: 1 addition & 1 deletion docs/Development-Guide.rst
@@ -19,7 +19,7 @@ Important Classes
+-------------------------+----------------------------------------------------------------------------------------+
| ``Bin`` | Data structure used for storing feature discrete values (converted from float values) |
+-------------------------+----------------------------------------------------------------------------------------+
| ``Boosting`` | Boosting interface (GBDT, DART, GOSS, etc.) |
| ``Boosting`` | Boosting interface (GBDT, DART, etc.) |
+-------------------------+----------------------------------------------------------------------------------------+
| ``Config`` | Stores parameters and configurations |
+-------------------------+----------------------------------------------------------------------------------------+
14 changes: 10 additions & 4 deletions docs/Parameters.rst
@@ -127,18 +127,24 @@ Core Parameters

- label should be ``int`` type, and larger number represents the higher relevance (e.g. 0:bad, 1:fair, 2:good, 3:perfect)

- ``boosting`` :raw-html:`<a id="boosting" title="Permalink to this parameter" href="#boosting">&#x1F517;&#xFE0E;</a>`, default = ``gbdt``, type = enum, options: ``gbdt``, ``rf``, ``dart``, ``goss``, aliases: ``boosting_type``, ``boost``
- ``boosting`` :raw-html:`<a id="boosting" title="Permalink to this parameter" href="#boosting">&#x1F517;&#xFE0E;</a>`, default = ``gbdt``, type = enum, options: ``gbdt``, ``rf``, ``dart``, aliases: ``boosting_type``, ``boost``

- ``gbdt``, traditional Gradient Boosting Decision Tree, aliases: ``gbrt``

- ``rf``, Random Forest, aliases: ``random_forest``

- ``dart``, `Dropouts meet Multiple Additive Regression Trees <https://arxiv.org/abs/1505.01866>`__

- ``goss``, Gradient-based One-Side Sampling

- **Note**: internally, LightGBM uses ``gbdt`` mode for the first ``1 / learning_rate`` iterations

- ``data_sample_strategy`` :raw-html:`<a id="data_sample_strategy" title="Permalink to this parameter" href="#data_sample_strategy">&#x1F517;&#xFE0E;</a>`, default = ``bagging``, type = enum, options: ``bagging``, ``goss``

- ``bagging``, Randomly Bagging Sampling

- **Note**: ``bagging`` is only effective when ``bagging_freq > 0`` and ``bagging_fraction < 1.0``

- ``goss``, Gradient-based One-Side Sampling

- ``data`` :raw-html:`<a id="data" title="Permalink to this parameter" href="#data">&#x1F517;&#xFE0E;</a>`, default = ``""``, type = string, aliases: ``train``, ``train_data``, ``train_data_file``, ``data_filename``

- path of training data, LightGBM will train from this data
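Note on the hunk above: ``goss`` is dropped from the ``boosting`` options and reappears as an option of the new ``data_sample_strategy`` parameter. The following is a minimal C++ sketch of how a caller might translate an old-style parameter map, together with the documented condition under which ``bagging`` has any effect. `MigrateGossParams`, `BaggingIsEffective`, and `RunMigrationExample` are hypothetical helpers written for illustration, not LightGBM functions, and mapping ``boosting=goss`` to ``boosting=gbdt`` is an assumption about the intended equivalent.

```cpp
#include <cassert>
#include <map>
#include <string>

using Params = std::map<std::string, std::string>;

// Hypothetical helper (not part of LightGBM): rewrite a parameter map that
// selects GOSS through `boosting` so that it selects GOSS through
// `data_sample_strategy` instead.
// Assumption: `boosting=gbdt` is the intended replacement for `boosting=goss`.
Params MigrateGossParams(Params params) {
  auto it = params.find("boosting");
  if (it != params.end() && it->second == "goss") {
    it->second = "gbdt";
    params["data_sample_strategy"] = "goss";
  }
  return params;
}

// Encodes the note from the parameter docs: `bagging` is only effective
// when bagging_freq > 0 and bagging_fraction < 1.0.
bool BaggingIsEffective(int bagging_freq, double bagging_fraction) {
  return bagging_freq > 0 && bagging_fraction < 1.0;
}

// Small usage example for the two helpers above.
void RunMigrationExample() {
  Params old_style;
  old_style["boosting"] = "goss";
  const Params new_style = MigrateGossParams(old_style);
  assert(new_style.at("boosting") == "gbdt");
  assert(new_style.at("data_sample_strategy") == "goss");
  assert(!BaggingIsEffective(0, 0.5));  // bagging_freq must be positive
}
```

The sketch only looks at the ``boosting`` key; the aliases ``boosting_type`` and ``boost`` listed in the docs would need the same treatment in real code.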
@@ -268,7 +274,7 @@ Learning Control Parameters

- ``num_threads`` is relatively small, e.g. ``<= 16``

- you want to use small ``bagging_fraction`` or ``goss`` boosting to speed up
- you want to use small ``bagging_fraction`` or ``goss`` sample strategy to speed up

- **Note**: setting this to ``true`` will double the memory cost for Dataset object. If you have not enough memory, you can try setting ``force_col_wise=true``

13 changes: 10 additions & 3 deletions include/LightGBM/config.h
@@ -153,14 +153,21 @@ struct Config {
// [doc-only]
// type = enum
// alias = boosting_type, boost
// options = gbdt, rf, dart, goss
// options = gbdt, rf, dart
// desc = ``gbdt``, traditional Gradient Boosting Decision Tree, aliases: ``gbrt``
// desc = ``rf``, Random Forest, aliases: ``random_forest``
// desc = ``dart``, `Dropouts meet Multiple Additive Regression Trees <https://arxiv.org/abs/1505.01866>`__
// desc = ``goss``, Gradient-based One-Side Sampling
// descl2 = **Note**: internally, LightGBM uses ``gbdt`` mode for the first ``1 / learning_rate`` iterations
std::string boosting = "gbdt";

// [doc-only]
// type = enum
// options = bagging, goss
// desc = ``bagging``, Randomly Bagging Sampling
// descl2 = **Note**: ``bagging`` is only effective when ``bagging_freq > 0`` and ``bagging_fraction < 1.0``
// desc = ``goss``, Gradient-based One-Side Sampling
std::string data_sample_strategy = "bagging";

// alias = train, train_data, train_data_file, data_filename
// desc = path of training data, LightGBM will train from this data
// desc = **Note**: can be used only in CLI version
@@ -263,7 +270,7 @@ struct Config {
// desc = enabling this is recommended when:
// descl2 = the number of data points is large, and the total number of bins is relatively small
// descl2 = ``num_threads`` is relatively small, e.g. ``<= 16``
// descl2 = you want to use small ``bagging_fraction`` or ``goss`` boosting to speed up
// descl2 = you want to use small ``bagging_fraction`` or ``goss`` sample strategy to speed up
// desc = **Note**: setting this to ``true`` will double the memory cost for Dataset object. If you have not enough memory, you can try setting ``force_col_wise=true``
// desc = **Note**: when both ``force_col_wise`` and ``force_row_wise`` are ``false``, LightGBM will firstly try them both, and then use the faster one. To remove the overhead of testing set the faster one to ``true`` manually
// desc = **Note**: this parameter cannot be used at the same time with ``force_col_wise``, choose only one of them
2 changes: 2 additions & 0 deletions include/LightGBM/cuda/cuda_tree.hpp
@@ -77,6 +77,8 @@ class CUDATree : public Tree {
const data_size_t* used_data_indices,
data_size_t num_data, double* score) const override;

inline void AsConstantTree(double val) override;

const int* cuda_leaf_parent() const { return cuda_leaf_parent_; }

const int* cuda_left_child() const { return cuda_left_child_; }
6 changes: 5 additions & 1 deletion include/LightGBM/cuda/cuda_utils.h
@@ -10,10 +10,10 @@
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <LightGBM/utils/log.h>
#endif // USE_CUDA || USE_CUDA_EXP

#ifdef USE_CUDA_EXP
#include <LightGBM/utils/log.h>
#include <vector>
#endif // USE_CUDA_EXP

@@ -119,8 +119,12 @@ class CUDAVector {
}

void Resize(size_t size) {
if (size == size_) {
return;
}
if (size == 0) {
Clear();
return;
}
T* new_data = nullptr;
AllocateCUDAMemory<T>(&new_data, size, __FILE__, __LINE__);
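Note on the `Resize` hunk: going by the hunk header (8 lines before, 12 after), the four added lines appear to be the `size == size_` early return and the `return` after `Clear()`. Without that second `return`, a call with `size == 0` would clear the vector and then fall through to the allocation path. The host-only C++ sketch below mirrors that control flow so the effect can be checked without a GPU. `HostVector` is a hypothetical stand-in, `new[]`/`delete[]` replace the CUDA allocation helpers, and the copy-then-free tail is an assumption about what the elided remainder of `Resize` does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical host-side analogue of the CUDAVector in the diff.
template <typename T>
class HostVector {
 public:
  HostVector() = default;
  HostVector(const HostVector&) = delete;
  HostVector& operator=(const HostVector&) = delete;
  ~HostVector() { Clear(); }

  void Resize(std::size_t size) {
    if (size == size_) {
      return;  // same size: nothing to allocate or copy
    }
    if (size == 0) {
      Clear();
      return;  // do not fall through to a zero-sized allocation
    }
    T* new_data = new T[size]();
    ++num_allocations_;
    if (data_ != nullptr) {
      // Assumed behaviour: keep as many old elements as fit.
      std::copy(data_, data_ + std::min(size, size_), new_data);
      delete[] data_;
    }
    data_ = new_data;
    size_ = size;
  }

  void Clear() {
    delete[] data_;
    data_ = nullptr;
    size_ = 0;
  }

  T& operator[](std::size_t i) {
    assert(i < size_);
    return data_[i];
  }
  std::size_t size() const { return size_; }
  const T* data() const { return data_; }
  int num_allocations() const { return num_allocations_; }

 private:
  T* data_ = nullptr;
  std::size_t size_ = 0;
  int num_allocations_ = 0;  // counts allocations so the early returns are observable
};
```

The allocation counter exists only to make the early returns visible in a test; it has no counterpart in the diff.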
5 changes: 4 additions & 1 deletion include/LightGBM/objective_function.h
@@ -101,9 +101,12 @@ class ObjectiveFunction {
/*!
* \brief Convert output for CUDA version
*/
const double* ConvertOutputCUDA(data_size_t /*num_data*/, const double* input, double* /*output*/) const {
virtual const double* ConvertOutputCUDA(data_size_t /*num_data*/, const double* input, double* /*output*/) const {
return input;
}

virtual bool NeedConvertOutputCUDA () const { return false; }

#endif // USE_CUDA_EXP
};
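Note on the `ObjectiveFunction` hunk: `ConvertOutputCUDA` becomes `virtual` and a `NeedConvertOutputCUDA` query is added. With a non-virtual method, a call through a base-class reference always runs the base implementation, so a derived objective could not supply its own conversion. The host-only C++ sketch below illustrates the pattern. The class names, the sigmoid conversion, and `ConvertScores` are hypothetical, and using the query to decide whether to size a scratch buffer is an assumption about how such a flag might be used, not something the diff shows.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Simplified, hypothetical stand-in for the objective interface.
class ObjectiveSketch {
 public:
  virtual ~ObjectiveSketch() = default;
  // Default: scores need no conversion, so hand back the input buffer.
  virtual const double* ConvertOutput(int /*num_data*/, const double* input,
                                      double* /*output*/) const {
    return input;
  }
  virtual bool NeedConvertOutput() const { return false; }
};

// Hypothetical objective that maps raw scores through a sigmoid.
class SigmoidObjectiveSketch : public ObjectiveSketch {
 public:
  const double* ConvertOutput(int num_data, const double* input,
                              double* output) const override {
    for (int i = 0; i < num_data; ++i) {
      output[i] = 1.0 / (1.0 + std::exp(-input[i]));
    }
    return output;
  }
  bool NeedConvertOutput() const override { return true; }
};

// Caller that only sees the base class. Because both methods are virtual,
// the derived overrides are selected at run time.
const double* ConvertScores(const ObjectiveSketch& objective,
                            const std::vector<double>& raw,
                            std::vector<double>* scratch) {
  assert(scratch != nullptr);
  if (objective.NeedConvertOutput()) {
    scratch->resize(raw.size());  // size the buffer only when it will be written
  }
  return objective.ConvertOutput(static_cast<int>(raw.size()), raw.data(),
                                 scratch->data());
}
```

For the base class the returned pointer is the caller's own input buffer, and for the sigmoid sketch it is the scratch buffer, which is the distinction the `Need...` query lets a caller anticipate.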
