FastDDS -> OpenDDS ownership failures #35

Closed
jrw972 opened this issue Apr 1, 2024 · 4 comments

jrw972 commented Apr 1, 2024

Executable's name

  • Publisher: eprosima_fastdds-2.13.3_shape_main_linux
  • Subscriber: opendds-3.27.0_shape_main_linux

Reproducing the problem

  • Test Suite: rtps_test_suite_1
  • Test Case: Test_Ownership_3 (and possibly others)
  • Link to the GitHub Action workflow run: NA

What is the problem?

  • Publisher expected code: OK
  • Publisher produced code: OK
  • Subscriber expected code: RECEIVING_FROM_BOTH
  • Subscriber produced code: RECEIVING_FROM_ONE

Suggestions about why this problem exists

The subscriber output consisted of 87 BLUE samples followed by 55 RED samples and a truncated RED sample.
Since the test driver only looks at max_samples_received = 125 samples, it would process only 38 RED samples (125 - 87).
Since there is no interleaving, the test fails the subscriber with RECEIVING_FROM_ONE.
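
The arithmetic above can be sketched as follows. This is only an illustration, not the test suite's actual check: `is_interleaved` and `reported_output` are hypothetical names, and the criterion used here (the color must change at least twice inside the inspected window) is an assumption about what "interleaving" means.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the test driver's check; not the real test suite code.
// Assumption: "interleaving" means the color changes at least twice within the
// inspected window, i.e. one publisher is seen again after the other one.
bool is_interleaved(const std::vector<std::string>& colors, std::size_t max_samples)
{
    const std::size_t limit = colors.size() < max_samples ? colors.size() : max_samples;
    std::size_t changes = 0;
    for (std::size_t i = 1; i < limit; ++i)
    {
        if (colors[i] != colors[i - 1])
        {
            ++changes;
        }
    }
    return changes >= 2;
}

// Builds the sequence reported in this issue: 87 BLUE samples followed by 55 RED samples.
std::vector<std::string> reported_output()
{
    std::vector<std::string> colors(87, std::string("BLUE"));
    colors.insert(colors.end(), 55, std::string("RED"));
    return colors;
}
```

With a window of 125 samples, the first 87 entries are BLUE and the remaining 38 are RED, so the color changes only once and the sketch reports no interleaving.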

Packet capture revealed the following about the conversation between the RED publisher and the subscriber.
The notation [X] indicates packet X in the attached capture (capture.pcap.gz).

  1. OpenDDS sends subscription for Square reader [82].
  2. FastDDS sends directed heartbeat. Last is 15, first is 16. [84]
  3. FastDDS sends samples 16 through 30.
  4. FastDDS sends publication for Square writer [132].
  5. OpenDDS sends preassociation acknack [133].
  6. FastDDS sends samples 31 through 43.
  7. OpenDDS sends preassociation acknack [160].
  8. FastDDS sends samples 44 through 71.
  9. OpenDDS sends preassociation acknack [221].
  10. FastDDS sends samples 72 through 98.
  11. FastDDS sends directed heartbeat. Last is 98, first is 16. [286]
  12. FastDDS sends sample 99.
  13. OpenDDS sends acknack requesting samples 16-30 [289].
  14. FastDDS sends samples 16-30 [291].
  15. FastDDS sends samples 100 through 116.

Logically, the heartbeat in [84] should be deferred until at least the publication announcement in [132].
Moreover, it appears that FastDDS is ignoring the non-final preassociation acknacks [133, 160, 221].
Essentially, the delay in discovery combined with the "late" heartbeat causes the samples to be queued so that samples 16 through 99 are probably delivered at once.
This explains the output of the subscriber program.
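
To make the two heartbeats easier to compare, here is a small sketch of the sequence-number ranges they announce. `Heartbeat` and `advertised_samples` are hypothetical names used only for illustration, not Fast DDS or OpenDDS API, and reading "last = first - 1" as an empty range is an assumption about how the values in [84] should be interpreted.

```cpp
#include <cstdint>

// Sequence-number range announced by a heartbeat, as read from the capture.
// Hypothetical type for illustration only.
struct Heartbeat
{
    std::int64_t first;
    std::int64_t last;
};

// Number of samples the heartbeat advertises as available.
// Assumption: last == first - 1 means the writer has nothing to offer yet.
std::int64_t advertised_samples(const Heartbeat& hb)
{
    return hb.last >= hb.first ? hb.last - hb.first + 1 : 0;
}
```

Under that reading, the heartbeat in [84] (first 16, last 15) advertises no samples, while the one in [286] (first 16, last 98) advertises 83.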

The delay in discovery seems to be a similar problem.
OpenDDS sends a preassociation acknack for the publication writer in [64], [96], [108], and [121].
It receives a heartbeat in [122] and [124].
It then requests the publication in [128] which is sent in [132].
The publication is acknowledged in [192] after a heartbeat in [189] and [190].

Other comments

@MiguelCompany

@jrw972 Thank you for opening the issue and for the deep analysis.

I've checked the code, and it seems the issue is that Fast DDS is ignoring acknacks with count == 0.

We'll change it to something like the following:

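    bool check_and_set_acknack_count(
            uint32_t acknack_count)
    {
        // Accept the acknack if its count is not lower than the next expected one.
        if (acknack_count >= next_expected_acknack_count_)
        {
            // The following acknack must carry a higher count than this one.
            next_expected_acknack_count_ = acknack_count;
            ++next_expected_acknack_count_;
            return true;
        }

        return false;
    }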
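
For readers following along, here is a self-contained sketch of how such a check would treat a first acknack with count 0. Both structs are hypothetical stand-ins, not the actual Fast DDS classes. The initial value of 0 is an assumption, and `StrictAcknackCounter` is only an assumed shape of a check that would drop count == 0, included for contrast.

```cpp
#include <cstdint>

// Hypothetical wrapper around the proposed check; not the actual Fast DDS class.
struct AcknackCounter
{
    // Assumption: the expected count starts at 0.
    std::uint32_t next_expected_acknack_count_ = 0;

    bool check_and_set_acknack_count(std::uint32_t acknack_count)
    {
        if (acknack_count >= next_expected_acknack_count_)
        {
            next_expected_acknack_count_ = acknack_count;
            ++next_expected_acknack_count_;
            return true;
        }
        return false;
    }
};

// Assumed shape of a stricter check, for contrast only: a count is accepted
// only when it is strictly greater than the last one seen, so 0 is never accepted.
struct StrictAcknackCounter
{
    std::uint32_t last_acknack_count_ = 0;

    bool check_and_set_acknack_count(std::uint32_t acknack_count)
    {
        if (acknack_count > last_acknack_count_)
        {
            last_acknack_count_ = acknack_count;
            return true;
        }
        return false;
    }
};
```

With these assumptions, the first struct accepts an acknack with count 0 once and rejects a repeat of it, while the stricter one never accepts count 0.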

In the meantime, is there some setting you can configure in OpenDDS to make the count start at 1?

@MiguelCompany

@jrw972 I opened eProsima/Fast-DDS#4639, which should fix this. I will upload a new binary after merging it.

jrw972 commented Apr 2, 2024

> In the meantime, is there some setting you can configure in OpenDDS to make the count start at 1?

Unfortunately, that would require a code change.

@MiguelCompany

Closing according to the test results on this action
