Rework MAP and Pairwise for LTR. (#9075)
trivialfis authored Apr 27, 2023
1 parent 0e470ef commit e206b89
Showing 19 changed files with 612 additions and 1,135 deletions.
1 change: 0 additions & 1 deletion R-package/src/Makevars.in
@@ -32,7 +32,6 @@ OBJECTS= \
$(PKGROOT)/src/objective/objective.o \
$(PKGROOT)/src/objective/regression_obj.o \
$(PKGROOT)/src/objective/multiclass_obj.o \
$(PKGROOT)/src/objective/rank_obj.o \
$(PKGROOT)/src/objective/lambdarank_obj.o \
$(PKGROOT)/src/objective/hinge.o \
$(PKGROOT)/src/objective/aft_obj.o \
1 change: 0 additions & 1 deletion R-package/src/Makevars.win
@@ -32,7 +32,6 @@ OBJECTS= \
$(PKGROOT)/src/objective/objective.o \
$(PKGROOT)/src/objective/regression_obj.o \
$(PKGROOT)/src/objective/multiclass_obj.o \
$(PKGROOT)/src/objective/rank_obj.o \
$(PKGROOT)/src/objective/lambdarank_obj.o \
$(PKGROOT)/src/objective/hinge.o \
$(PKGROOT)/src/objective/aft_obj.o \
18 changes: 14 additions & 4 deletions doc/model.schema
@@ -219,6 +219,16 @@
"num_pairsample": { "type": "string" },
"fix_list_weight": { "type": "string" }
}
},
"lambdarank_param": {
"type": "object",
"properties": {
"lambdarank_num_pair_per_sample": { "type": "string" },
"lambdarank_pair_method": { "type": "string" },
"lambdarank_unbiased": {"type": "string" },
"lambdarank_bias_norm": {"type": "string" },
"ndcg_exp_gain": {"type": "string"}
}
}
},
"type": "object",
@@ -477,22 +487,22 @@
"type": "object",
"properties": {
"name": { "const": "rank:pairwise" },
"lambda_rank_param": { "$ref": "#/definitions/lambda_rank_param"}
"lambda_rank_param": { "$ref": "#/definitions/lambdarank_param"}
},
"required": [
"name",
"lambda_rank_param"
"lambdarank_param"
]
},
{
"type": "object",
"properties": {
"name": { "const": "rank:ndcg" },
"lambda_rank_param": { "$ref": "#/definitions/lambda_rank_param"}
"lambda_rank_param": { "$ref": "#/definitions/lambdarank_param"}
},
"required": [
"name",
"lambda_rank_param"
"lambdarank_param"
]
},
{
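For reference, a hypothetical objective fragment from a saved model that the renamed ``lambdarank_param`` definition would validate might look like the following (the values are illustrative; per the schema above, every parameter is serialized as a string):

    # Hypothetical fragment of a saved model's JSON, shown as a Python literal.
    objective = {
        "name": "rank:ndcg",
        "lambdarank_param": {
            "lambdarank_num_pair_per_sample": "1",
            "lambdarank_pair_method": "mean",
            "lambdarank_unbiased": "0",
            "lambdarank_bias_norm": "2",
            "ndcg_exp_gain": "1",
        },
    }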
43 changes: 37 additions & 6 deletions doc/parameter.rst
@@ -233,7 +233,7 @@ Parameters for Tree Booster
.. note:: This parameter is a work in progress.

- The strategy used for training multi-target models, including multi-target regression
and multi-class classification. See :doc:`/tutorials/multioutput` for more information.
and multi-class classification. See :doc:`/tutorials/multioutput` for more information.

- ``one_output_per_tree``: One model for each target.
- ``multi_output_tree``: Use multi-target trees.
@@ -380,9 +380,9 @@ Specify the learning task and the corresponding learning objective. The objectiv
See :doc:`/tutorials/aft_survival_analysis` for details.
- ``multi:softmax``: set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class(number of classes)
- ``multi:softprob``: same as softmax, but output a vector of ``ndata * nclass``, which can be further reshaped to ``ndata * nclass`` matrix. The result contains predicted probability of each data point belonging to each class.
- ``rank:pairwise``: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
- ``rank:ndcg``: Use LambdaMART to perform list-wise ranking where `Normalized Discounted Cumulative Gain (NDCG) <http://en.wikipedia.org/wiki/NDCG>`_ is maximized
- ``rank:map``: Use LambdaMART to perform list-wise ranking where `Mean Average Precision (MAP) <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_ is maximized
- ``rank:ndcg``: Use LambdaMART to perform pair-wise ranking where `Normalized Discounted Cumulative Gain (NDCG) <http://en.wikipedia.org/wiki/NDCG>`_ is maximized. This objective supports position debiasing for click data.
- ``rank:map``: Use LambdaMART to perform pair-wise ranking where `Mean Average Precision (MAP) <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_ is maximized
- ``rank:pairwise``: Use LambdaRank to perform pair-wise ranking using the `ranknet` objective.
- ``reg:gamma``: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Occurrence_and_applications>`_.
- ``reg:tweedie``: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be `Tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Occurrence_and_applications>`_.

@@ -395,8 +395,9 @@ Specify the learning task and the corresponding learning objective. The objectiv

* ``eval_metric`` [default according to objective]

- Evaluation metrics for validation data, a default metric will be assigned according to objective (rmse for regression, and logloss for classification, mean average precision for ranking)
- User can add multiple evaluation metrics. Python users: remember to pass the metrics in as list of parameters pairs instead of map, so that latter ``eval_metric`` won't override previous one
- Evaluation metrics for validation data, a default metric will be assigned according to objective (rmse for regression, and logloss for classification, `mean average precision` for ``rank:map``, etc.)
- User can add multiple evaluation metrics. Python users: remember to pass the metrics in as list of parameters pairs instead of map, so that latter ``eval_metric`` won't override previous ones

- The choices are listed below:

- ``rmse``: `root mean square error <http://en.wikipedia.org/wiki/Root_mean_square_error>`_
@@ -480,6 +481,36 @@ Parameter for using AFT Survival Loss (``survival:aft``) and Negative Log Likeli

* ``aft_loss_distribution``: Probability Density Function, ``normal``, ``logistic``, or ``extreme``.

.. _ltr-param:

Parameters for learning to rank (``rank:ndcg``, ``rank:map``, ``rank:pairwise``)
================================================================================

These are parameters specific to the learning-to-rank task. See :doc:`Learning to Rank </tutorials/learning_to_rank>` for an in-depth explanation; a usage sketch follows the parameter list below.

* ``lambdarank_pair_method`` [default = ``mean``]

How to construct pairs for pair-wise learning.

- ``mean``: Sample ``lambdarank_num_pair_per_sample`` pairs for each document in the query list.
- ``topk``: Focus on the top-``lambdarank_num_pair_per_sample`` documents: construct :math:`|query|` pairs for each document ranked in the top ``lambdarank_num_pair_per_sample`` positions by the model.

* ``lambdarank_num_pair_per_sample`` [range = :math:`[1, \infty]`]

It specifies the number of pairs sampled for each document when the pair method is ``mean``, or the truncation level for queries when the pair method is ``topk``. For example, to train with ``ndcg@6``, set ``lambdarank_num_pair_per_sample`` to :math:`6` and ``lambdarank_pair_method`` to ``topk``.

* ``lambdarank_unbiased`` [default = ``false``]

Specify whether the input click data should be debiased.

* ``lambdarank_bias_norm`` [default = 2.0]

:math:`L_p` normalization for position debiasing; the default is :math:`L_2`. Only relevant when ``lambdarank_unbiased`` is set to true.

* ``ndcg_exp_gain`` [default = ``true``]

Whether to use the exponential gain function for ``NDCG``. There are two forms of gain function for ``NDCG``: one uses the relevance value directly, while the other uses :math:`2^{rel} - 1` to put stronger emphasis on retrieving relevant documents. When ``ndcg_exp_gain`` is true (the default), the relevance degree cannot be greater than 31.
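
As an illustrative sketch (assuming the 2.0 Python interface, where these parameters are passed directly to ``XGBRanker``; the toy data below is hypothetical):

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Hypothetical toy data: 2 queries with 8 documents each, relevance in [0, 3].
    rng = np.random.default_rng(0)
    X = rng.normal(size=(16, 4))
    y = rng.integers(0, 4, size=16)
    qid = np.repeat([0, 1], 8)  # rows grouped by query id

    ranker = xgb.XGBRanker(
        objective="rank:ndcg",
        lambdarank_pair_method="topk",
        lambdarank_num_pair_per_sample=6,  # roughly optimize for ndcg@6
        eval_metric="ndcg@6",
    )
    ranker.fit(X, y, qid=qid)

With ``topk`` pairing and 6 pairs per sample, the objective concentrates its gradient on the head of each ranked list.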

***********************
Command Line Parameters
***********************
7 changes: 5 additions & 2 deletions python-package/xgboost/testing/__init__.py
@@ -431,8 +431,11 @@ def make_ltr(
"""Make a dataset for testing LTR."""
rng = np.random.default_rng(1994)
X = rng.normal(0, 1.0, size=n_samples * n_features).reshape(n_samples, n_features)
y = rng.integers(0, max_rel, size=n_samples)
qid = rng.integers(0, n_query_groups, size=n_samples)
y = np.sum(X, axis=1)
y -= y.min()
y = np.round(y / y.max() * max_rel).astype(np.int32)

qid = rng.integers(0, n_query_groups, size=n_samples, dtype=np.int32)
w = rng.normal(0, 1.0, size=n_query_groups)
w -= np.min(w)
w /= np.max(w)
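A sketch of how the revised helper might be driven (assuming ``make_ltr`` returns the features, labels, query ids, and per-query weights computed above):

    import numpy as np
    import xgboost as xgb
    from xgboost import testing as tm

    # make_ltr now derives graded labels from the features instead of sampling
    # them independently, so the ranking signal is learnable.
    X, y, qid, w = tm.make_ltr(n_samples=512, n_features=4, n_query_groups=3, max_rel=5)

    # qid is drawn randomly, so group rows by query id before building the DMatrix.
    order = np.argsort(qid)
    Xy = xgb.DMatrix(X[order], label=y[order], qid=qid[order])
    booster = xgb.train({"objective": "rank:ndcg"}, Xy, num_boost_round=10)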
1 change: 0 additions & 1 deletion src/metric/rank_metric.cc
@@ -493,7 +493,6 @@ class EvalMAPScore : public EvalRankWithCache<ltr::MAPCache> {
auto rank_idx = p_cache->SortedIdx(ctx_, predt.ConstHostSpan());

common::ParallelFor(p_cache->Groups(), ctx_->Threads(), [&](auto g) {
auto g_predt = h_predt.Slice(linalg::Range(gptr[g], gptr[g + 1]));
auto g_label = h_label.Slice(linalg::Range(gptr[g], gptr[g + 1]));
auto g_rank = rank_idx.subspan(gptr[g]);

193 changes: 193 additions & 0 deletions src/objective/lambdarank_obj.cc
@@ -69,6 +69,7 @@ void LambdaRankUpdatePositionBias(Context const* ctx, linalg::VectorView<double
lj(i) += g_lj(i);
}
}

// The ti+ is not guaranteed to decrease since it depends on the |\delta Z|
//
// The update normalizes the ti+ to make ti+(0) equal to 1, which breaks the probability
@@ -432,9 +433,201 @@ void LambdaRankUpdatePositionBias(Context const*, linalg::VectorView<double cons
#endif // !defined(XGBOOST_USE_CUDA)
} // namespace cuda_impl

namespace cpu_impl {
void MAPStat(Context const* ctx, linalg::VectorView<float const> label,
common::Span<std::size_t const> rank_idx, std::shared_ptr<ltr::MAPCache> p_cache) {
auto h_n_rel = p_cache->NumRelevant(ctx);
auto gptr = p_cache->DataGroupPtr(ctx);

CHECK_EQ(h_n_rel.size(), gptr.back());
CHECK_EQ(h_n_rel.size(), label.Size());

auto h_acc = p_cache->Acc(ctx);

common::ParallelFor(p_cache->Groups(), ctx->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto g_n_rel = h_n_rel.subspan(gptr[g], cnt);
auto g_rank = rank_idx.subspan(gptr[g], cnt);
auto g_label = label.Slice(linalg::Range(gptr[g], gptr[g + 1]));

// The number of relevant documents at each position
g_n_rel[0] = g_label(g_rank[0]);
for (std::size_t k = 1; k < g_rank.size(); ++k) {
g_n_rel[k] = g_n_rel[k - 1] + g_label(g_rank[k]);
}

// \sum l_k/k
auto g_acc = h_acc.subspan(gptr[g], cnt);
g_acc[0] = g_label(g_rank[0]) / 1.0;

for (std::size_t k = 1; k < g_rank.size(); ++k) {
g_acc[k] = g_acc[k - 1] + (g_label(g_rank[k]) / static_cast<double>(k + 1));
}
});
}
} // namespace cpu_impl
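
For intuition, a standalone NumPy sketch of the two prefix arrays ``MAPStat`` fills per query group (an illustration, not the library code):

    import numpy as np

    def map_stat(labels, rank_idx):
        # labels: 0/1 relevance per document; rank_idx: document indices
        # sorted by model score, best first (mirrors cpu_impl::MAPStat).
        rel = labels[rank_idx].astype(np.float64)
        n_rel = np.cumsum(rel)                             # relevant docs within top-k
        acc = np.cumsum(rel / np.arange(1, rel.size + 1))  # \sum_k l_k / k
        return n_rel, acc

    labels = np.array([1.0, 0.0, 1.0, 0.0])
    rank_idx = np.array([2, 0, 3, 1])  # the model ranks document 2 first
    n_rel, acc = map_stat(labels, rank_idx)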

class LambdaRankMAP : public LambdaRankObj<LambdaRankMAP, ltr::MAPCache> {
public:
void GetGradientImpl(std::int32_t iter, const HostDeviceVector<float>& predt,
const MetaInfo& info, HostDeviceVector<GradientPair>* out_gpair) {
CHECK(param_.ndcg_exp_gain) << "NDCG gain can not be set for the MAP objective.";
if (ctx_->IsCUDA()) {
return cuda_impl::LambdaRankGetGradientMAP(
ctx_, iter, predt, info, GetCache(), ti_plus_.View(ctx_->gpu_id),
tj_minus_.View(ctx_->gpu_id), li_full_.View(ctx_->gpu_id), lj_full_.View(ctx_->gpu_id),
out_gpair);
}

auto gptr = p_cache_->DataGroupPtr(ctx_).data();
bst_group_t n_groups = p_cache_->Groups();

out_gpair->Resize(info.num_row_);
auto h_gpair = out_gpair->HostSpan();
auto h_label = info.labels.HostView().Slice(linalg::All(), 0);
auto h_predt = predt.ConstHostSpan();
auto rank_idx = p_cache_->SortedIdx(ctx_, h_predt);
auto h_weight = common::MakeOptionalWeights(ctx_, info.weights_);

auto make_range = [&](bst_group_t g) { return linalg::Range(gptr[g], gptr[g + 1]); };

cpu_impl::MAPStat(ctx_, h_label, rank_idx, GetCache());
auto n_rel = GetCache()->NumRelevant(ctx_);
auto acc = GetCache()->Acc(ctx_);

auto delta_map = [&](auto y_high, auto y_low, std::size_t rank_high, std::size_t rank_low,
bst_group_t g) {
if (rank_high > rank_low) {
std::swap(rank_high, rank_low);
std::swap(y_high, y_low);
}
auto cnt = gptr[g + 1] - gptr[g];
// In a hot loop
auto g_n_rel = common::Span<double const>{n_rel.data() + gptr[g], cnt};
auto g_acc = common::Span<double const>{acc.data() + gptr[g], cnt};
auto d = DeltaMAP(y_high, y_low, rank_high, rank_low, g_n_rel, g_acc);
return d;
};
using D = decltype(delta_map);

common::ParallelFor(n_groups, ctx_->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto w = h_weight[g];
auto g_predt = h_predt.subspan(gptr[g], cnt);
auto g_gpair = h_gpair.subspan(gptr[g], cnt);
auto g_label = h_label.Slice(make_range(g));
auto g_rank = rank_idx.subspan(gptr[g], cnt);

auto args = std::make_tuple(this, iter, g_predt, g_label, w, g_rank, g, delta_map, g_gpair);

if (param_.lambdarank_unbiased) {
std::apply(&LambdaRankMAP::CalcLambdaForGroup<true, D>, args);
} else {
std::apply(&LambdaRankMAP::CalcLambdaForGroup<false, D>, args);
}
});
}
static char const* Name() { return "rank:map"; }
[[nodiscard]] const char* DefaultEvalMetric() const override {
return this->RankEvalMetric("map");
}
};
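
The ``delta_map`` functor above weighs each pair by how much MAP changes when the two documents trade ranks. The library derives this in closed form from the ``n_rel`` and ``acc`` prefix arrays; a brute-force sketch of the same quantity (illustrative only):

    import numpy as np

    def average_precision(rel_sorted):
        # rel_sorted: 0/1 relevance of documents in ranked order.
        k = np.arange(1, rel_sorted.size + 1)
        n_rel = np.cumsum(rel_sorted)
        if n_rel[-1] == 0:
            return 0.0
        return float((rel_sorted * n_rel / k).sum() / n_rel[-1])

    def delta_map_bruteforce(rel_sorted, i, j):
        # |change in AP| when the documents at ranks i and j swap places.
        swapped = rel_sorted.copy()
        swapped[i], swapped[j] = swapped[j], swapped[i]
        return abs(average_precision(swapped) - average_precision(rel_sorted))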

#if !defined(XGBOOST_USE_CUDA)
namespace cuda_impl {
void MAPStat(Context const*, MetaInfo const&, common::Span<std::size_t const>,
std::shared_ptr<ltr::MAPCache>) {
common::AssertGPUSupport();
}

void LambdaRankGetGradientMAP(Context const*, std::int32_t, HostDeviceVector<float> const&,
const MetaInfo&, std::shared_ptr<ltr::MAPCache>,
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double>, linalg::VectorView<double>,
HostDeviceVector<GradientPair>*) {
common::AssertGPUSupport();
}
} // namespace cuda_impl
#endif // !defined(XGBOOST_USE_CUDA)

/**
* \brief The RankNet loss.
*/
class LambdaRankPairwise : public LambdaRankObj<LambdaRankPairwise, ltr::RankingCache> {
public:
void GetGradientImpl(std::int32_t iter, const HostDeviceVector<float>& predt,
const MetaInfo& info, HostDeviceVector<GradientPair>* out_gpair) {
CHECK(param_.ndcg_exp_gain) << "NDCG gain can not be set for the pairwise objective.";
if (ctx_->IsCUDA()) {
return cuda_impl::LambdaRankGetGradientPairwise(
ctx_, iter, predt, info, GetCache(), ti_plus_.View(ctx_->gpu_id),
tj_minus_.View(ctx_->gpu_id), li_full_.View(ctx_->gpu_id), lj_full_.View(ctx_->gpu_id),
out_gpair);
}

auto gptr = p_cache_->DataGroupPtr(ctx_);
bst_group_t n_groups = p_cache_->Groups();

out_gpair->Resize(info.num_row_);
auto h_gpair = out_gpair->HostSpan();
auto h_label = info.labels.HostView().Slice(linalg::All(), 0);
auto h_predt = predt.ConstHostSpan();
auto h_weight = common::MakeOptionalWeights(ctx_, info.weights_);

auto make_range = [&](bst_group_t g) { return linalg::Range(gptr[g], gptr[g + 1]); };
auto rank_idx = p_cache_->SortedIdx(ctx_, h_predt);

auto delta = [](auto...) { return 1.0; };
using D = decltype(delta);

common::ParallelFor(n_groups, ctx_->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto w = h_weight[g];
auto g_predt = h_predt.subspan(gptr[g], cnt);
auto g_gpair = h_gpair.subspan(gptr[g], cnt);
auto g_label = h_label.Slice(make_range(g));
auto g_rank = rank_idx.subspan(gptr[g], cnt);

auto args = std::make_tuple(this, iter, g_predt, g_label, w, g_rank, g, delta, g_gpair);
if (param_.lambdarank_unbiased) {
std::apply(&LambdaRankPairwise::CalcLambdaForGroup<true, D>, args);
} else {
std::apply(&LambdaRankPairwise::CalcLambdaForGroup<false, D>, args);
}
});
}

static char const* Name() { return "rank:pairwise"; }
[[nodiscard]] const char* DefaultEvalMetric() const override {
return this->RankEvalMetric("ndcg");
}
};
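
For reference, the per-pair RankNet gradient this objective accumulates, with ``delta`` fixed to 1.0 as in the functor above, can be sketched as:

    import math

    def ranknet_pair_grad(s_high, s_low):
        # Gradient and hessian of log(1 + exp(-(s_high - s_low))) w.r.t.
        # s_high, the score of the more relevant document; the gradient
        # for the less relevant document's score is the negation.
        p = 1.0 / (1.0 + math.exp(s_high - s_low))  # sigmoid(s_low - s_high)
        grad = -p
        hess = max(p * (1.0 - p), 1e-16)
        return grad, hess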

#if !defined(XGBOOST_USE_CUDA)
namespace cuda_impl {
void LambdaRankGetGradientPairwise(Context const*, std::int32_t, HostDeviceVector<float> const&,
const MetaInfo&, std::shared_ptr<ltr::RankingCache>,
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double>, linalg::VectorView<double>,
HostDeviceVector<GradientPair>*) {
common::AssertGPUSupport();
}
} // namespace cuda_impl
#endif // !defined(XGBOOST_USE_CUDA)

XGBOOST_REGISTER_OBJECTIVE(LambdaRankNDCG, LambdaRankNDCG::Name())
.describe("LambdaRank with NDCG loss as objective")
.set_body([]() { return new LambdaRankNDCG{}; });

XGBOOST_REGISTER_OBJECTIVE(LambdaRankPairwise, LambdaRankPairwise::Name())
.describe("LambdaRank with RankNet loss as objective")
.set_body([]() { return new LambdaRankPairwise{}; });

XGBOOST_REGISTER_OBJECTIVE(LambdaRankMAP, LambdaRankMAP::Name())
.describe("LambdaRank with MAP loss as objective.")
.set_body([]() { return new LambdaRankMAP{}; });

DMLC_REGISTRY_FILE_TAG(lambdarank_obj);
} // namespace xgboost::obj