Xgboost giving different results on mac and ubuntu #8834

alishametkari · 2023-02-23T18:35:38Z

I am using xgboost version 1.5.0.2 on a Mac and on an Ubuntu machine. The two machines give different predictions from xgb.train() for a time series forecasting problem. On the Mac the predictions are acceptable and in the correct range, but on Ubuntu they are very odd, almost a straight line, and the difference between the two is huge. Why is there such a difference? I want to get the same predictions on Linux as on the Mac. How can I do that? Can anyone help?

R version - 3.6.3
Xgboost version - 1.5.0.2

trivialfis · 2023-02-23T21:09:51Z

Could you please share a reproducible example?

alishametkari · 2023-02-24T11:02:34Z

Sure @trivialfis
Here it is.

library(xgboost)

testing_data_length <- nrow(test_x)
train_pred <- rep(NaN, nrow(train_x))
test_pred <- rep(NaN, testing_data_length)
params <- list(valid_sample_len = 5, cols_to_drop = c(), seed = 2017, nthread = 1,
               nrounds = 2000, early_stopping_rounds = 500, eval_metric = 'rmse', objective = "reg:linear",
               booster = "gbtree", eta = 0.1, subsample = 0.5, colsample_bytree = 0.5)
hyper_params <- list(max_depth = 4, enable_shap_values = FALSE, enable_variable_importance_values = FALSE)

# define hyperparameters
validation_sample_len <- params$valid_sample_len
train_data_len <- nrow(train_x)
train_period_end <- train_data_len - validation_sample_len
set.seed(params$seed)
features <- sort(colnames(train_x))
if (length(params$cols_to_drop) > 0) {
  features <- features[!features %in% params$cols_to_drop]
}
dsample <- xgb.DMatrix(data.matrix(train_x[, features]), missing = NA)

# prepare validation dataset
dval <- data.matrix(train_x[train_period_end:train_data_len, features])
dval <- xgb.DMatrix(data = dval, label = data.matrix(train_y[train_period_end:train_data_len]), missing = NA)
watchlist_dval <- list(dval = dval)

# prepare training dataset
# save the column names
# `start` was not defined in the original snippet; assumed here to be the first row
start <- 1
dtrain <- data.matrix(train_x[start:train_data_len, features])
dtrain <- xgb.DMatrix(data = dtrain, label = data.matrix(train_y[start:train_data_len]), missing = NA)

# prepare test dataset
dtest <- data.matrix(test_x[, features])
dtest <- xgb.DMatrix(dtest, missing = NA)

# train model
# Note: only hyper_params is passed as `params`; eta, subsample, colsample_bytree,
# objective, and booster from `params` above never reach xgb.train()
xgb_model <- xgb.train(params = hyper_params,
                       data = dtrain,
                       nrounds = params$nrounds,
                       verbose = 0,
                       print_every_n = 5,
                       early_stopping_rounds = params$early_stopping_rounds,
                       eval_metric = params$eval_metric,
                       nthread = params$nthread,
                       watchlist = watchlist_dval,
                       maximize = FALSE)

# forecast on test data
test_pred <- predict(xgb_model, dtest)
test_pred <- ifelse(test_pred < 0, 0, test_pred)

# forecast on entire dataset
train_pred <- predict(xgb_model, dsample)
train_pred <- ifelse(train_pred < 0, 0, train_pred)

out <- data.frame(forecast = c(train_pred, test_pred))

alishametkari · 2023-02-24T12:18:29Z

Could this be caused by different versions of the packages that xgboost depends on?
Can you tell me which packages xgboost depends on in R?
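
One way to check this is to compare the environment on both machines. The sketch below uses only base R; `describe_package` is a hypothetical helper name, not part of xgboost. It reports the installed version and the dependency fields declared in a package's DESCRIPTION file, so the output can be diffed between the Mac and the Ubuntu machine.

```r
# Hypothetical helper (not part of xgboost): report the installed version and
# the declared dependency fields of a package, using only base R utilities.
describe_package <- function(pkg) {
  desc <- suppressWarnings(utils::packageDescription(pkg))
  # packageDescription() returns NA when the package is not installed
  if (!is.list(desc)) {
    return(list(package = pkg, installed = FALSE))
  }
  # DESCRIPTION fields that are absent are reported as NA instead of NULL
  field <- function(name) {
    if (is.null(desc[[name]])) NA_character_ else desc[[name]]
  }
  list(
    package   = pkg,
    installed = TRUE,
    version   = field("Version"),
    depends   = field("Depends"),
    imports   = field("Imports"),
    linking   = field("LinkingTo")
  )
}

# Run on both machines and compare the printed output
str(describe_package("xgboost"))
print(R.version.string)
```

Running `sessionInfo()` on each machine additionally lists the operating system, the BLAS/LAPACK libraries in use, and the versions of all loaded packages.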

trivialfis · 2023-02-25T19:55:31Z

Let me take a closer look. I have been working on #8822 for a while and am trying to switch back to normal maintenance work.

trivialfis · 2023-02-25T22:01:09Z

Apologies, I can't debug the code you shared without a dataset. It would be great if you could share something I can run, maybe with a pseudo dataset. I'm asking because we run tests on multiple machines (see our CI runs on PRs) and the results are consistent. The issue you are describing is new to me, and I can't guess the reason without actually reproducing it.
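
A pseudo dataset along these lines might be enough to make the snippet runnable. This is only a sketch: the row counts, the feature names `lag_1`, `lag_2`, and `trend`, and the shape of the signal are made up for illustration. Only the object names `train_x`, `train_y`, and `test_x` come from the code shared earlier in this thread.

```r
# Synthetic stand-in for the real data; sizes, feature names, and the signal
# below are assumptions chosen only for illustration.
set.seed(2017)
n_train <- 200
n_test  <- 20
n <- n_train + n_test + 2  # two extra rows are lost when building the lags

# A noisy seasonal series with a linear trend
t_idx  <- seq_len(n)
series <- 50 + 0.1 * t_idx + 10 * sin(2 * pi * t_idx / 12) + rnorm(n, sd = 2)

# Lagged values as features; the first two rows have incomplete lags
full <- data.frame(
  lag_1 = c(NA, head(series, -1)),
  lag_2 = c(NA, NA, head(series, -2)),
  trend = t_idx,
  y     = series
)
full <- full[-(1:2), ]

train_rows <- seq_len(n_train)
test_rows  <- seq(n_train + 1, nrow(full))
feature_cols <- c("lag_1", "lag_2", "trend")

train_x <- full[train_rows, feature_cols]
train_y <- full[train_rows, "y"]  # plain numeric vector
test_x  <- full[test_rows, feature_cols]
```

Because the seed is fixed, the same data can be regenerated on each machine, and the predictions saved there (for example with `write.csv`) can then be compared directly.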

trivialfis · 2023-08-09T07:47:31Z

Closing as stalled.

alishametkari changed the title ~~Xgboost giving different results on mac and linux~~ Xgboost giving different results on mac and ubuntu Feb 23, 2023

trivialfis added the status: need update label Mar 17, 2023

trivialfis closed this as completed Aug 9, 2023
