Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpma: make gpspm server use seperate RCQ #272

Closed
wants to merge 1 commit into from
Closed

Conversation

yangx-jy
Copy link

@yangx-jy yangx-jy commented Feb 16, 2022

Signed-off-by: Xiao Yang [email protected]


This change is Reviewable

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1, all commit messages.
Reviewable status: 1 of 3 files reviewed, all discussions resolved (waiting on @ldorau)

@grom72 grom72 requested a review from janekmi February 17, 2022 09:33
Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: 2 of 3 files reviewed, 4 unresolved discussions (waiting on @janekmi and @yangx-jy)


engines/librpma_gpspm.c, line 636 at r1 (raw file):

}

static int server_cmpl_process(struct thread_data *td, struct ibv_wc *wc, bool use_rcq)
static int server_cmpl_process(struct thread_data *td, bool use_rcq, struct ibv_wc *wc)

result in the last argument

Code quote:

static int server_cmpl_process(struct thread_data *td, struct ibv_wc *wc, bool use_rcq)

engines/librpma_gpspm.c, line 636 at r1 (raw file):

}

static int server_cmpl_process(struct thread_data *td, struct ibv_wc *wc, bool use_rcq)
struct rpma_cq *cq

Would it be better to keep server_cmpl_process simple/universal and move all Recv/Send specific handling to server_queue_process

Code quote:

bool use_rcq

engines/librpma_gpspm.c, line 700 at r1 (raw file):

		return ret;

	if (wc.opcode == IBV_WC_RECV) {

It is possible that wc will be initiated with random values and will not be modified by server_cmpl_process() if we do not have any RCQ entry.
Shall we find another way to check that wc is a valid one?
i.e.

wc.status = IBV_WC_GENERAL_ERR
ret = server_cmpl_process(td, &wc, 1);
if (ret)
	return ret;
if ( wc.status == IBV_WC_SUCCESS && wc.opcode == IBV_WC_RECV) {

or let server_cmpl_process() return 1 if there is something to process.

Suggestion:

	ret = server_cmpl_process(td, &wc, 1);
	if (ret)
		return ret;

	if (wc.opcode == IBV_WC_RECV) {

engines/librpma_gpspm.c, line 707 at r1 (raw file):

	/* process the send completion */
	ret = server_cmpl_process(td, &wc, 0);

NULL?
As we do not need this value at all.
or we use wc for additional validation

Code quote:

&wc

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 6 unresolved discussions (waiting on @janekmi and @yangx-jy)


engines/librpma_fio.h, line 256 at r1 (raw file):

	struct rpma_conn *conn;
	struct rpma_cq *cq;
	struct rpma_cq *rcq;

Should this be moved to struct server_data { ...} of GPSPM server?

Code quote:

struct rpma_cq *rcq;

engines/librpma_fio.c, line 1044 at r1 (raw file):

	}

	/* get the connection's receive CQ */

This is GPSPM specyfic code
Shall this be moved to server_open_file() in gpspm engine?

Code quote:

	/* get the connection's receive CQ */
	if ((ret = rpma_conn_get_rcq(csd->conn, &csd->rcq))) {
		librpma_td_verror(td, ret, "rpma_conn_get_rcq");
		goto err_conn_delete;
	}

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1.
Reviewable status: 2 of 3 files reviewed, 6 unresolved discussions (waiting on @janekmi and @yangx-jy)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @janekmi and @yangx-jy)

Copy link

@osalyk osalyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @janekmi and @yangx-jy)

a discussion (no related file):
Unfortunately, this implementation does not work.
image.png


Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, @ldorau, and @osalyk)

a discussion (no related file):

Previously, osalyk (Oksana Sałyk) wrote…

Unfortunately, this implementation does not work.
image.png

@osalyk

Sorry, I didn't reproduce the issue.
Could you test my new PR?

Thanks a lot



engines/librpma_gpspm.c, line 636 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…
static int server_cmpl_process(struct thread_data *td, bool use_rcq, struct ibv_wc *wc)

result in the last argument

Done. Deleted bool use_rcq argument.


engines/librpma_gpspm.c, line 636 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…
struct rpma_cq *cq

Would it be better to keep server_cmpl_process simple/universal and move all Recv/Send specific handling to server_queue_process

Done.


engines/librpma_gpspm.c, line 700 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

It is possible that wc will be initiated with random values and will not be modified by server_cmpl_process() if we do not have any RCQ entry.
Shall we find another way to check that wc is a valid one?
i.e.

wc.status = IBV_WC_GENERAL_ERR
ret = server_cmpl_process(td, &wc, 1);
if (ret)
	return ret;
if ( wc.status == IBV_WC_SUCCESS && wc.opcode == IBV_WC_RECV) {

or let server_cmpl_process() return 1 if there is something to process.

Done. I call memset(wc, 0, sizeof(struct ibv_wc)) in server_cmpl_process(). In this case, I think wc.opcode == IBV_WC_RECV means a receive completion.


engines/librpma_gpspm.c, line 707 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

NULL?
As we do not need this value at all.
or we use wc for additional validation

Done.


engines/librpma_fio.h, line 256 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

Should this be moved to struct server_data { ...} of GPSPM server?

Done.


engines/librpma_fio.c, line 1044 at r1 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

This is GPSPM specyfic code
Shall this be moved to server_open_file() in gpspm engine?

Done.

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r2, all commit messages.
Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @grom72, @janekmi, and @osalyk)

Copy link

@osalyk osalyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @grom72, @janekmi, and @yangx-jy)

a discussion (no related file):

Previously, yangx-jy (Xiao Yang) wrote…

@osalyk

Sorry, I didn't reproduce the issue.
Could you test my new PR?

Thanks a lot

Still not working.


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @grom72, @janekmi, and @yangx-jy)

@yangx-jy
Copy link
Author

a discussion (no related file):

Previously, osalyk (Oksana Sałyk) wrote…

Still not working.

@osalyk

Is there any output?


Copy link

@janekmi janekmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r2, all commit messages.
Reviewable status: 2 of 3 files reviewed, 9 unresolved discussions (waiting on @grom72, @osalyk, and @yangx-jy)


engines/librpma_gpspm.c, line 520 at r2 (raw file):

	/*
	 * Calculate the required queue sizes where:

Without any mathematical symbol, it is not a calculation anymore. xD

My recommendation:

The required queue sizes are:

engines/librpma_gpspm.c, line 542 at r2 (raw file):

		goto err_cfg_delete;
	}

No newlines so far among these if statements. I think you might consider these four as a group. ;-)


engines/librpma_gpspm.c, line 658 at r2 (raw file):

	uint32_t qes_to_process = min(sd->msg_queued_nr, sd->msg_sqe_available);
	if (qes_to_process == 0)
		return 0;

The new implementation lacks security checks.

Note:

  • the server_cmpl_process() does not wait for a completion if o->busy_wait_polling != 0

So, you can easily end up in a situation where you have collected none of the SEND completions. On the other hand, appearing RECV completions forces you to put yet another SEND request to the SQ. Where you have no other guarantee that SQ can accept the next item except collecting CQEs from CQ.

Bottom Line

You have to be sure SQ can accept the next item before you will process another RECV request.


engines/librpma_gpspm.c, line 722 at r2 (raw file):

	return 0;
}

What is the reason behind changing the order of:

  • server_queue_process() and
  • server_cmpl_process()?

It makes the review harder.

@yangx-jy yangx-jy force-pushed the rpma branch 2 times, most recently from 6faeeeb to d32c819 Compare February 22, 2022 03:03
Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, and @osalyk)


engines/librpma_gpspm.c, line 658 at r2 (raw file):

Previously, janekmi (Jan Michalski) wrote…

The new implementation lacks security checks.

Note:

  • the server_cmpl_process() does not wait for a completion if o->busy_wait_polling != 0

So, you can easily end up in a situation where you have collected none of the SEND completions. On the other hand, appearing RECV completions forces you to put yet another SEND request to the SQ. Where you have no other guarantee that SQ can accept the next item except collecting CQEs from CQ.

Bottom Line

You have to be sure SQ can accept the next item before you will process another RECV request.

Done. Ensure at least one SQ slot is available by original sd->msg_sqe_available.


engines/librpma_gpspm.c, line 722 at r2 (raw file):

Previously, janekmi (Jan Michalski) wrote…

What is the reason behind changing the order of:

  • server_queue_process() and
  • server_cmpl_process()?

It makes the review harder.

We have to call server_cmpl_process(csd->cq) after server_qe_process(). If not, server_cmpl_process(csd->cq) will be blocked when busy_wait_polling is set to 0.

@janekmi I' m glad to see you. ^_^

Copy link

@janekmi janekmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, and @osalyk)

a discussion (no related file):
@yangx-jy I'm glad to see you too. :-)
I hope I won't slow down too much your review.

I'm here for sentimental reasons. I have created this software queue some time ago and at that time we came up with the idea to simplify this implementation with the introduction of SRQ. So, for me, it is a kind of closure.


Copy link

@janekmi janekmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, @osalyk, and @yangx-jy)


engines/librpma_gpspm.c, line 722 at r2 (raw file):

Previously, yangx-jy (Xiao Yang) wrote…

We have to call server_cmpl_process(csd->cq) after server_qe_process(). If not, server_cmpl_process(csd->cq) will be blocked when busy_wait_polling is set to 0.

@janekmi I' m glad to see you. ^_^

@yangx-jy I'm glad to see you too. :-)

I understand the reason. Would you mind solving this technical issue differently so it will be easier to follow the bigger architectural changes going on here? e.g.:

/* XXX to be removed when server_cmpl_process() and server_queue_process() will change places */
server_cmpl_process(); /* declaration only */

server_queue_process()
{
    /* ... */
    server_cmpl_process(); /* the problematic use */
    /* ... */
}

server_cmpl_process()
{
    /* ... */
}

Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, and @osalyk)

a discussion (no related file):

Previously, janekmi (Jan Michalski) wrote…

@yangx-jy I'm glad to see you too. :-)
I hope I won't slow down too much your review.

I'm here for sentimental reasons. I have created this software queue some time ago and at that time we came up with the idea to simplify this implementation with the introduction of SRQ. So, for me, it is a kind of closure.

RCQ or SRQ?



engines/librpma_gpspm.c, line 722 at r2 (raw file):

Previously, janekmi (Jan Michalski) wrote…

@yangx-jy I'm glad to see you too. :-)

I understand the reason. Would you mind solving this technical issue differently so it will be easier to follow the bigger architectural changes going on here? e.g.:

/* XXX to be removed when server_cmpl_process() and server_queue_process() will change places */
server_cmpl_process(); /* declaration only */

server_queue_process()
{
    /* ... */
    server_cmpl_process(); /* the problematic use */
    /* ... */
}

server_cmpl_process()
{
    /* ... */
}

Done. It's my fault.

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72, @janekmi, and @osalyk)

@yangx-jy
Copy link
Author

a discussion (no related file):

Previously, yangx-jy (Xiao Yang) wrote…

@osalyk

Is there any output?

@ldorau @grom72 @osalyk

After reading the logic of gpspm client, it seems that a lot of common code are used by both gpspm client and apm client.
If gpspm client wants to use RCQ, I think we have to refactor a lot of common code and split them into librpma_gpspm.c and librpma_apm.c. Do you think so?


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 7 unresolved discussions (waiting on @grom72 and @janekmi)

a discussion (no related file):

Previously, yangx-jy (Xiao Yang) wrote…

@ldorau @grom72 @osalyk

After reading the logic of gpspm client, it seems that a lot of common code are used by both gpspm client and apm client.
If gpspm client wants to use RCQ, I think we have to refactor a lot of common code and split them into librpma_gpspm.c and librpma_apm.c. Do you think so?

@yangx-jy Decreasing amount of duplicated code without complicating it is always a good idea ;-) So feel free to submit a separate pull request for that (please do not do it in the current one).


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @grom72, @janekmi, and @yangx-jy)


engines/librpma_gpspm.c, line 65 at r4 (raw file):

static int client_get_io_u_index(struct ibv_wc *wc, unsigned int *io_u_index);

static int server_cmpl_process(struct thread_data *td, struct rpma_cq *cq,

Could you reverse the order of server_queue_process() and server_cmpl_process and remove this forward declaration?

Code quote:

static int server_cmpl_process(struct thread_data *td, struct rpma_cq *cq,
		struct ibv_wc *wc);

engines/librpma_gpspm.c, line 657 at r4 (raw file):

	struct librpma_fio_server_data *csd = td->io_ops_data;
	struct server_data *sd = csd->server_data;
	struct ibv_wc wc[2];

Is there any reason to use an array here?
It is a bit misleading for me, because - as far as I can see - it is not used as an array anywhere.

In my opinion, it would be more clear to use separate variables: wc_cq instead of wc[0] and wc_rcq instead of wc[1]:

	struct ibv_wc wc_cq;
	struct ibv_wc wc_rcq;
	int num_entries_got;

Code quote:

	struct ibv_wc wc[2];

engines/librpma_gpspm.c, line 661 at r4 (raw file):

	/* process the receive completion */
	ret = server_cmpl_process(td, sd->rcq, &wc[0]);

Please use an additional argument num_entries_got to make sure that a completion has been received:

	/* process the receive completion */
	ret = server_cmpl_process(td, sd->rcq, &wc_rcq, &num_entries_got);
	if (ret)
		return ret;

	if (num_entries_got && wc_rcq.opcode == IBV_WC_RECV) {

Code quote:

	ret = server_cmpl_process(td, sd->rcq, &wc[0]);
	if (ret)
		return ret;

	if (wc[0].opcode == IBV_WC_RECV) {

engines/librpma_gpspm.c, line 668 at r4 (raw file):

		while (!sd->msg_sqe_available) {
			/* ensure that at least one SQ slot is available */
			ret = server_cmpl_process(td, csd->cq, &wc[1]);
ret = server_cmpl_process(td, csd->cq, &wc_cq, NULL);

Code quote:

			ret = server_cmpl_process(td, csd->cq, &wc[1]);

engines/librpma_gpspm.c, line 678 at r4 (raw file):

		/* process the send completion */
		ret = server_cmpl_process(td, csd->cq, &wc[1]);
ret = server_cmpl_process(td, csd->cq, &wc_cq, NULL);

Code quote:

ret = server_cmpl_process(td, csd->cq, &wc[1]);

engines/librpma_gpspm.c, line 686 at r4 (raw file):

}

static int server_cmpl_process(struct thread_data *td, struct rpma_cq *cq, struct ibv_wc *wc)

Please add the num_entries_got argument to make sure a completion has been received:

static int server_cmpl_process(struct thread_data *td, struct rpma_cq *cq,
		struct ibv_wc *wc, int *num_entries_got)

Code quote:

static int server_cmpl_process(struct thread_data *td, struct rpma_cq *cq, struct ibv_wc *wc)

engines/librpma_gpspm.c, line 695 at r4 (raw file):

	memset(wc, 0, sizeof(struct ibv_wc));

	ret = rpma_cq_get_wc(cq, 1, wc, NULL);

It can be too expensive to call memset() in the critical path, so please remove it and and use rpma_cq_get_wc() with the num_entries_got argument (we already have the right mechanism in place):

	ret = rpma_cq_get_wc(cq, 1, wc, num_entries_got);

Code quote:

	memset(wc, 0, sizeof(struct ibv_wc));

	ret = rpma_cq_get_wc(cq, 1, wc, NULL);

engines/librpma_gpspm.c, line 707 at r4 (raw file):

			}

			ret = rpma_cq_get_wc(cq, 1, wc, NULL);

..., wc, num_entries_got);

Code quote:

rpma_cq_get_wc(cq, 1, wc, NULL);

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @grom72, @janekmi, and @yangx-jy)


engines/librpma_gpspm.c, line 700 at r1 (raw file):

Previously, yangx-jy (Xiao Yang) wrote…

Done. I call memset(wc, 0, sizeof(struct ibv_wc)) in server_cmpl_process(). In this case, I think wc.opcode == IBV_WC_RECV means a receive completion.

It can be too expensive to call memset() in the critical path, so please remove it and and use rpma_cq_get_wc() with the num_entries_got argument (we already have the right mechanism in place).

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @grom72, @janekmi, and @yangx-jy)


engines/librpma_gpspm.c, line 722 at r2 (raw file):

Previously, yangx-jy (Xiao Yang) wrote…

Done. It's my fault.

Please reverse the order of server_queue_process() and server_cmpl_process() and remove the forward declaration.

Copy link

@janekmi janekmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @janekmi, @ldorau, @osalyk, and @yangx-jy)

a discussion (no related file):

Previously, yangx-jy (Xiao Yang) wrote…

RCQ or SRQ?

RCQ obviously. ;-)



engines/librpma_gpspm.c, line 722 at r2 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

Please reverse the order of server_queue_process() and server_cmpl_process() and remove the forward declaration.

Sure it can be either way. Obviously, in the end, the forward declaration is not necessary. I assume you do not agree with my argumentation and want to skip directly to the final implementation with a single PR.

But it is just me guessing, maybe you can elaborate on what is your rationale so we can argue nicely, how about that? 😉

Copy link

@janekmi janekmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @ldorau, @osalyk, and @yangx-jy)


engines/librpma_gpspm.c, line 65 at r4 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

Could you reverse the order of server_queue_process() and server_cmpl_process and remove this forward declaration?

. (in this case, it is a dot of disagreement 😉)


engines/librpma_gpspm.c, line 554 at r4 (raw file):

	if ((ret = librpma_fio_server_open_file(td, f, cfg))) {
		librpma_td_verror(td, ret, "librpma_fio_server_open_file");

It is not an API call. librpma_fio_server_open_file() will log its exact reason of failure by itself.
Please remove the line.


engines/librpma_gpspm.c, line 695 at r4 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

It can be too expensive to call memset() in the critical path, so please remove it and and use rpma_cq_get_wc() with the num_entries_got argument (we already have the right mechanism in place):

	ret = rpma_cq_get_wc(cq, 1, wc, num_entries_got);

👍


engines/librpma_gpspm.c, line 707 at r4 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

..., wc, num_entries_got);

👍

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @grom72 and @yangx-jy)


engines/librpma_gpspm.c, line 747 at r12 (raw file):

Previously, yangx-jy (Xiao Yang) wrote…

Done.

Yes, that's OK.

Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @grom72 and @ldorau)


engines/librpma_gpspm.c, line 747 at r12 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

Yes, that's OK.

Done.


engines/librpma_gpspm.c, line 744 at r13 (raw file):

Previously, ldorau (Lukasz Dorau) wrote…

Or even:

	if (ret != 1)
		return ret;

at the beginning.

Done. Cool.

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r14, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @grom72)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r14, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @yangx-jy)

a discussion (no related file):
I do not have anything to add. So it is time for solution benchmarking now.


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @yangx-jy)

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @yangx-jy)

a discussion (no related file):

Previously, grom72 (Tomasz Gromadzki) wrote…

I do not have anything to add. So it is time for solution benchmarking now.

I will benchmark it.


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yangx-jy)

a discussion (no related file):
Waiting for results of benchmarks ...


Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yangx-jy)

a discussion (no related file):

Previously, ldorau (Lukasz Dorau) wrote…

Waiting for results of benchmarks ...

The results are following:

image.png


@yangx-jy
Copy link
Author

yangx-jy commented May 3, 2022

Hi @ldorau @grom72

I have updated this PR based on shared completion channel (i.e. devel branch on rpma).
I hope you can test if the latest PR doesn't downgrade the performance and give me feedback.

Copy link
Member

@ldorau ldorau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on @grom72, @ldorau, and @yangx-jy)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @grom72, @ldorau, and @yangx-jy)


engines/librpma_gpspm.c line 755 at r15 (raw file):

		while (sd->msg_sqe_available == 0) {
			/* process the send completion */
			ret = server_cmpl_poll(td, csd->cq, &cq_wc);

let's use rpma_conn_wait to wait for cq event and ignore rcq events if we need to get space for at least one send completion

Suggestion:

			ret = server_cmpl_poll(td, csd->cq, &cq_wc);
			if (ret < 0)
				return ret;
			} else if (ret == 0 ) {
				do {
					ret = rpma_conn_wait(csd->conn, &cq1, &is_rcq);
						if (ret) {
							librpma_td_verror(td, ret, "rpma_conn_wait");
							td->terminate = true;
							return ret;
						}
					}
				} while(is_rcq);
				server_cmpl_poll(td, cq1, &cq_wc1)
			}

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r15, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @yangx-jy)

@yangx-jy
Copy link
Author

yangx-jy commented May 6, 2022

engines/librpma_gpspm.c line 755 at r15 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

let's use rpma_conn_wait to wait for cq event and ignore rcq events if we need to get space for at least one send completion

Done.

I also updated the rest in server_queue_wait_poll().

Note this PR is based on the devel branch on rpma repo.

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r16, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yangx-jy)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yangx-jy it seems to be OK - now it is time for benchmarking. @osalyk can you do this?

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yangx-jy)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @yangx-jy)

a discussion (no related file):
We get new performance results but numbers are still much below expectation.
Let's make one more change to be more proactive in reading send completion.
See proposed changes below.



engines/librpma_gpspm.c line 714 at r16 (raw file):

	ret = server_cmpl_poll(td, sd->rcq, &rcq_wc);
	if (ret != 1)
		return ret;

Suggestion:

	/* process the receive completion */
	ret = server_cmpl_poll(td, sd->rcq, &rcq_wc);
	if (ret != 1) {
		if (ret == 0) {
			/* process the SEND completion if nothing to be done with RCQ*/
			ret = server_cmpl_poll(td, sd->cq, &cq_wc);
		}
		return ret;
	}

engines/librpma_gpspm.c line 744 at r16 (raw file):

	if (ret == 0) {
		do {

Suggestion:

	if (ret == 0) {
		/* process all available send completions before wait for receive */
		do {
			ret = server_cmpl_poll(td, sd->cq, &cq_wc);
			if (ret < 0)
				return ret;
		} while (ret);
		do {

engines/librpma_gpspm.c line 753 at r16 (raw file):

			/* process the receive or send completion */
			ret = server_cmpl_poll(td, cq, is_rcq ? &rcq_wc :

Suggestion:

			/* process ALL available receive or send completions */
			do {
				ret = server_cmpl_poll(td, cq, is_rcq ? &rcq_wc :
						&cq_wc);
				if (ret < 0)
					return ret;
			} while (ret);

Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 5 unresolved discussions (waiting on @grom72 and @yangx-jy)


engines/librpma_gpspm.c line 714 at r16 (raw file):

	ret = server_cmpl_poll(td, sd->rcq, &rcq_wc);
	if (ret != 1)
		return ret;

Done.


engines/librpma_gpspm.c line 744 at r16 (raw file):

	if (ret == 0) {
		do {

Done.


engines/librpma_gpspm.c line 753 at r16 (raw file):

			/* process the receive or send completion */
			ret = server_cmpl_poll(td, cq, is_rcq ? &rcq_wc :

This change will make rpma_conn_wait() blocking in the next loop (server_queue_wait_poll). So I didn't apply the change.

By the way, I wonder if the previous buffered queue also can improve the performance. Previous buffered queue can buffer several completions and processes them together later.

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r17, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @yangx-jy)


engines/librpma_gpspm.c line 753 at r16 (raw file):

Previously, yangx-jy (Xiao Yang) wrote…

This change will make rpma_conn_wait() blocking in the next loop (server_queue_wait_poll). So I didn't apply the change.

By the way, I wonder if the previous buffered queue also can improve the performance. Previous buffered queue can buffer several completions and processes them together later.

The sequence below shows an optimal algorithm for a queue with event channel processing.

  1. we wait for an event
  2. we process as many completions as possible
  3. we set up the next event to be reported
  4. we process completions that was received to HW between step 2 and 3
	while (1) {
		ibv_get_cq_event(dev->channel, &cq, (void *) &ep);

		cnt = 0;
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			process_completion(&wc);
		}

		ibv_req_notify_cq(cq, 0);
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			process_completion(&wc);
		}

		ibv_ack_cq_events(cq, cnt);
	}

We can not do this (yet) with the existing librpma API.
Se my alternative change proposal.


engines/librpma_gpspm.c line 768 at r17 (raw file):

			}

			/* process a available receive or send completions */

Let's try to collect more send completions than only one per event.

Suggestion:

			if (!is_rcq) {
				/* process as many as possible send completions */
				do {
					ret = server_cmpl_poll(td, cq, &cq_wc);
					if (ret < 0)
						return ret;
				} while (ret);
			} else {
				/* process ONE available receive completions */
				ret = server_cmpl_poll(td, cq, &rcq_wc);
			}

Also use shared completion channel.

Signed-off-by: Xiao Yang <[email protected]>
Copy link
Author

@yangx-jy yangx-jy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @grom72 and @yangx-jy)


engines/librpma_gpspm.c line 753 at r16 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

The sequence below shows an optimal algorithm for a queue with event channel processing.

  1. we wait for an event
  2. we process as many completions as possible
  3. we set up the next event to be reported
  4. we process completions that was received to HW between step 2 and 3
	while (1) {
		ibv_get_cq_event(dev->channel, &cq, (void *) &ep);

		cnt = 0;
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			process_completion(&wc);
		}

		ibv_req_notify_cq(cq, 0);
		while (ibv_poll_cq(cq, 1, &wc) > 0) {
			cnt++;
			process_completion(&wc);
		}

		ibv_ack_cq_events(cq, cnt);
	}

We can not do this (yet) with the existing librpma API.
Se my alternative change proposal.

OK.


engines/librpma_gpspm.c line 768 at r17 (raw file):

Previously, grom72 (Tomasz Gromadzki) wrote…

Let's try to collect more send completions than only one per event.

Done.

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r18, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @yangx-jy)

Copy link

@grom72 grom72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ldorau and @yangx-jy)

a discussion (no related file):

Previously, ldorau (Lukasz Dorau) wrote…

The results are following:

image.png

R18 fixes latency:
image.png

but still, there are still be issues:
image copy 1.png

We need more time to check the root cause of such behavior. Some ideas:

  • with RCQ we have more events as every completion generates an event, in the case of "CQ only" consecutive cq_wc and rcq_wc generate one event if come immediately one after another
  • every event means context switching - many threads many context switching and kernel resources competition

Meanwhile, we could see if replacing rpma_cq_wait() with rpma_conn_wait() for the "only CQ" solution keeps the same performance.

Most probably we have to think about an alternative implementation of GPSPM in a long term. So far we can not promote GPSPM solution (without busy-wait polling) as an efficient one.


@ldorau ldorau added enhancement New feature or request and removed sprint goal labels May 25, 2022
@grom72 grom72 closed this Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants