OSPF TED failed when router LSA is received late #14994

Closed
odd22 opened this issue Dec 12, 2023 · 2 comments

odd22 commented Dec 12, 2023

In OSPFd, when mpls-te is activated and mpls-te export is set, the Traffic Engineering Database (TED) is built from the LSAs (both received and advertised). However, when running some topotests, in particular ospf__sr_te_topo1 on a heavily loaded box, the TED is not correctly formed.

The problem occurs when the router receives a foreign Router LSA after the Opaque LSA for a given link. In this case, the Opaque LSA is parsed first and a new edge is created in the TED. But when the Router LSA is then received and parsed, a new edge is created instead of the existing one being updated, and all TE link parameters are lost. If you wait until the Opaque LSA is refreshed, i.e. after 30 min., the TED is filled in correctly.

This causes every process that intends to use the TED, e.g. pathd, to fail: the corresponding edge has no TE link parameters and is therefore not usable from a TE point of view.
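The ordering problem described above can be illustrated with a toy model. This is only a sketch: ToyTed, Edge and the method names are hypothetical and do not mirror the actual ospfd data structures. It assumes edges are keyed by their local interface address.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Edge:
    local: str
    remote: Optional[str] = None
    te_params: dict = field(default_factory=dict)


class ToyTed:
    """Toy TED keyed by local interface address (hypothetical model)."""

    def __init__(self):
        self.edges = {}

    def on_opaque_lsa(self, local, remote, te_params):
        # TE Opaque LSA: create the edge if needed and attach the TE data.
        edge = self.edges.setdefault(local, Edge(local))
        edge.remote = remote
        edge.te_params = dict(te_params)

    def on_router_lsa_replace(self, local):
        # Faulty behaviour: a fresh edge replaces the existing one,
        # so TE data learned from an earlier Opaque LSA is lost.
        self.edges[local] = Edge(local)

    def on_router_lsa_update(self, local):
        # Expected behaviour: create only if missing, keep existing TE data.
        self.edges.setdefault(local, Edge(local))


ted = ToyTed()
ted.on_opaque_lsa("10.0.8.5", "10.0.8.6", {"max_bw": 1.25e6})
ted.on_router_lsa_replace("10.0.8.5")  # Router LSA arrives late
print(ted.edges["10.0.8.5"])
```

With on_router_lsa_update() in place of on_router_lsa_replace(), the edge keeps its remote address and TE parameters regardless of the arrival order.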

A simple show ip ospf mpls-te database will show the failure:

          Vertex (84215045): 5.5.5.5    Router Id: 5.5.5.5      Origin: OSPFv2  Status: Sync
            Type: Standard
            Segment Routing Capabilities:
                SRGB: [16000/23999]     SRLB: [15000/15999]     Algo: SPF       MSD: 8
            Outgoing Edges: 4
              To:       3.3.3.3(3.3.3.3)        Local:  10.0.4.5        Remote: 10.0.4.3
              To:       3.3.3.3(3.3.3.3)        Local:  10.0.5.5        Remote: 10.0.5.3
              To:       4.4.4.4(4.4.4.4)        Local:  10.0.6.5        Remote: 10.0.6.4
              To:       - (0.0.0.0)     Local:  10.0.8.5        Remote: 0.0.0.0                          <<===== Failure
            Incoming Edges: 4
              From:     3.3.3.3(3.3.3.3)        Local:  10.0.4.3        Remote: 10.0.4.5
              From:     3.3.3.3(3.3.3.3)        Local:  10.0.5.3        Remote: 10.0.5.5
              From:     4.4.4.4(4.4.4.4)        Local:  10.0.6.4        Remote: 10.0.6.5
              From:     6.6.6.6(6.6.6.6)        Local:  10.0.8.6        Remote: 10.0.8.5
            Subnets: 5
              Prefix:   5.5.5.5/32
              Prefix:   10.0.4.5/32
              Prefix:   10.0.5.5/32
              Prefix:   10.0.6.5/32
              Prefix:   10.0.8.5/32

Here, the link 10.0.8.5 / 10.0.8.6 is not correctly discovered and attached between router ID 5.5.5.5 and router ID 6.6.6.6.

          Edge (10.0.8.5): 10.0.8.5     Adv. Vertex: 5.5.5.5    Metric: 10      Status: Sync
            Origin: OSPFv2
            Local IPv4 address: 10.0.8.5

This edge output is consistent with an edge that has no remote IP address and no TE link information.

          LS age: 59
          Options: 0x42 : *|O|-|-|-|-|E|-
          LS Flags: 0x6
          LS Type: Area-Local Opaque-LSA
          Link State ID: 1.0.0.5 (Area-Local Opaque-Type/ID)
          Advertising Router: 5.5.5.5
          LS Seq Number: 80000001
          Checksum: 0x607d
          Length: 116

          Opaque-Type 1 (Traffic Engineering LSA)
          Opaque-ID   0x5
          Opaque-Info: 96 octets of data
          Router-Address: 5.5.5.5
          Link: 84 octets of data
          Link-Type: Point-to-point (1)
          Link-ID: 6.6.6.6
          Local Interface IP Address(es): 1
            #0: 10.0.8.5
          Remote Interface IP Address(es): 1
            #0: 10.0.8.6
          Maximum Bandwidth: 1.25e+06 (Bytes/sec)
          Maximum Reservable Bandwidth: 1.25e+06 (Bytes/sec)
          Unreserved Bandwidth per Class Type in Byte/s:
            [0]: 1.25e+06 (Bytes/sec),  [1]: 1.25e+06 (Bytes/sec)
            [2]: 1.25e+06 (Bytes/sec),  [3]: 1.25e+06 (Bytes/sec)
            [4]: 1.25e+06 (Bytes/sec),  [5]: 1.25e+06 (Bytes/sec)
            [6]: 1.25e+06 (Bytes/sec),  [7]: 1.25e+06 (Bytes/sec)

Yet the Opaque LSA is present in the LSDB with all TE link parameters.
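When reading the outputs above, two small conversions can help. The sketch below assumes that the number in parentheses after "Vertex" is the router ID read as a 32-bit integer, which matches the values shown here. The helper names are hypothetical.

```python
import ipaddress


def vertex_key_to_router_id(key: int) -> str:
    # Interpret the integer as an IPv4 address in dotted-quad form.
    return str(ipaddress.IPv4Address(key))


def bytes_per_sec_to_mbit(rate: float) -> float:
    # TE bandwidth values are displayed in Bytes/sec; convert to Mbit/s.
    return rate * 8 / 1e6


print(vertex_key_to_router_id(84215045))  # 5.5.5.5
print(bytes_per_sec_to_mbit(1.25e6))      # 10.0
```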

odd22 added the triage (Needs further investigation) label Dec 12, 2023
odd22 self-assigned this Dec 12, 2023
odd22 added the ospf label and removed the triage (Needs further investigation) label Dec 12, 2023

odd22 commented Dec 13, 2023

The problem is located in ospf_te_parse_router_lsa() in ospfd/ospf_te.c, lines 1912-1914. The goal there is to mark all edges (and, right after, all subnets) as ORPHAN in order to detect removals, since there is no explicit LSA to remove a link or a node: there is only a refresh with the updated list of links / nodes / prefixes, which must be compared with what was stored from previous LSAs. At the end of ospf_te_parse_router_lsa(), a call to ls_vertex_clean() removes all edges and subnets still marked as ORPHAN, i.e. all edges and subnets that have not been refreshed.

However, an OSPF router can send multiple Router LSAs, not only one. The previous method therefore works only if all prefixes are advertised in the same Router LSA. With multiple Router LSAs, previously configured edges and subnets are removed. Although the TED will eventually converge through the LSA refresh mechanism, this can take at least MAX_AGE (i.e. 3600 sec.), leaving the TED in an incoherent state for a long time, in particular during some tests.
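The mark-and-sweep logic, and the way it breaks once the links arrive in more than one Router LSA, can be sketched as follows. This is a simplified model, not the code from ospf_te.c: the edge table is a plain dict and parse_router_lsa() is a hypothetical stand-in for the real parser.

```python
ORPHAN, SYNC = "Orphan", "Sync"


def parse_router_lsa(edges, advertised):
    """Mark-and-sweep over a toy edge table (dict: local address -> status)."""
    # 1. Mark every known edge as ORPHAN.
    for local in edges:
        edges[local] = ORPHAN
    # 2. Refresh (or create) the edges listed in this Router LSA.
    for local in advertised:
        edges[local] = SYNC
    # 3. Sweep: drop every edge that was not refreshed.
    for local in [a for a, status in edges.items() if status == ORPHAN]:
        del edges[local]
    return edges


# One Router LSA listing every link: nothing is lost.
single = parse_router_lsa({}, ["10.0.4.5", "10.0.5.5", "10.0.6.5", "10.0.8.5"])

# Same links split over two Router LSAs: the second pass sweeps the first batch.
split = parse_router_lsa({}, ["10.0.4.5", "10.0.5.5"])
split = parse_router_lsa(split, ["10.0.6.5", "10.0.8.5"])

print(sorted(single))
print(sorted(split))
```

In this model, the second table ends up with only the links from the last LSA, which is the kind of loss described above.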

To overcome this issue, edges and subnets MUST be removed only when TE Opaque LSAs are flushed, not based on the advertisement of Router LSAs. The latter will be used only to create or update edges and subnets.
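A minimal sketch of the proposed split of responsibilities is shown below. It is an illustration only and not the change that was eventually merged: the class DecoupledTed and its method names are hypothetical, and edges are again assumed to be keyed by local address.

```python
class DecoupledTed:
    """Toy TED where only a TE Opaque LSA flush removes an edge."""

    def __init__(self):
        self.edges = {}  # local address -> attribute dict

    def on_router_lsa(self, links):
        # Create or update only; never delete from here.
        for local in links:
            self.edges.setdefault(local, {})

    def on_te_opaque_lsa(self, local, remote, te_params):
        # Attach TE data, creating the edge if the Router LSA is late.
        edge = self.edges.setdefault(local, {})
        edge["remote"] = remote
        edge["te"] = dict(te_params)

    def on_te_opaque_flush(self, local):
        # Removal is driven solely by the flush of the TE Opaque LSA.
        self.edges.pop(local, None)


ted = DecoupledTed()
ted.on_te_opaque_lsa("10.0.8.5", "10.0.8.6", {"max_bw": 1.25e6})
ted.on_router_lsa(["10.0.4.5", "10.0.5.5"])   # first Router LSA
ted.on_router_lsa(["10.0.6.5", "10.0.8.5"])   # second, late Router LSA
print(ted.edges["10.0.8.5"])
```

Under these assumptions, neither the arrival order nor the number of Router LSAs affects the TE parameters already stored for an edge.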


odd22 commented Dec 19, 2023

Fixed by #15026.

odd22 closed this as completed Dec 19, 2023