Monday, May 13, 2019

I'll show you mine if you show me yours: investigating OSPF's "B" bit


My dedication to delivering accurate content has postponed this post for more than 2 months. Every time I set out to document this behavior, I double-check everything and end up learning something new. Finally, I have enough stable material to present...

[1940s Victory at Sea announcer voice...]
Investigating OSPF's "B" bit! The case of "I'll show you mine if you show me yours".

While studying for my CCIE R/S, I was working through a lab recently and found myself confused by one of the tasks. I ended up spending 2 days deep-diving the technology, through repetitive labbing and reading RFCs, to understand what was going on and to answer the question: "How can you exchange prefixes across multiple areas in an OSPF domain when there is no Area 0?"

Most seasoned engineers would use a virtual-link to extend area 0 across a normal area, but what if there is no area 0 to begin with?

Let’s use a simple topology (the control):

The Control Topology

R1:

int lo1111
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 0

int eth0/0
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 1.1.1.1


In this topology you would build a virtual-link between Routers 1 & 2 across area 12 to extend area 0, so that R2 can share the networks of area 2 with the rest of the OSPF domain.

Now, let's change area 0 to area 1.
  • How can we establish full connectivity?
  • Create a virtual-link…?
Try it.

R1:

int lo1111
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 1

int eth0/0
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 1.1.1.1


The Test Topology (no area 0)


You’ve configured a virtual-link, but it won’t come up. Why not?

You can troubleshoot and run debugs such as ‘debug ip ospf adj’, but they won’t give you any reason why this isn’t working.


When you’re completely lost and have nowhere else to turn, the last place you look is the RFC. (I’m still trying to train myself to look at RFCs earlier in the troubleshooting process.)

From RFC2328 Section “15. Virtual Links”
...Virtual links serve to connect physically separate components of the backbone. The two endpoints of a virtual link are area border routers. The virtual link must be configured in both routers. The configuration information in each router consists of the other virtual endpoint (the other area border router), and the non-backbone area the two routers have in common (called the Transit area). Virtual links cannot be configured through stub areas (see Section 3.6).

So, the question is: how do 2 remote routers know whether they are “area border routers”? In most Cisco documentation an “area border router” is defined as having at least one interface in area 0 and other interfaces in one or more other areas. How does the RFC suggest routers identify themselves as Area Border Routers?

In RFC2328, Section “12.4.1. Router-LSAs”

A router also indicates whether it is an area border router, or an AS boundary router, by setting the appropriate bits (bit B and bit E, respectively) in its router-LSAs. This enables paths to those types of routers to be saved in the routing table, for later processing of summary-LSAs and AS-external-LSAs. Bit B should be set whenever the router is actively attached to two or more areas, even if the router is not currently attached to the OSPF backbone area. Bit E should never be set in a router-LSA for a stub area (stub areas cannot contain AS boundary routers).

OK, now we have something we can check. Let's perform an experiment against a control and grab a PCAP between the 2 routers to see whether the “B” bit is set in the Type-1 Router LSAs coming from either direction.
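If you want to read the bits straight out of the raw bytes rather than from Wireshark's decode, a minimal Python sketch looks like the following. It assumes you have already extracted the single flags octet that follows the 20-byte LSA header of a Router-LSA (RFC 2328, Appendix A.4.2, lays that octet out as 0 0 0 0 0 V E B). The function name and constants are made up for illustration and are not part of any library.

```python
from typing import Dict

# Bit masks for the flags octet that follows the 20-byte LSA header of a
# Router-LSA, per RFC 2328 Appendix A.4.2 (layout: 0 0 0 0 0 V E B).
FLAG_B = 0x01  # router is an area border router
FLAG_E = 0x02  # router is an AS boundary router
FLAG_V = 0x04  # router is an endpoint of a fully adjacent virtual link


def decode_router_lsa_flags(flags_octet: int) -> Dict[str, bool]:
    """Report which of the V, E and B bits are set in the flags octet."""
    return {
        "V": bool(flags_octet & FLAG_V),
        "E": bool(flags_octet & FLAG_E),
        "B": bool(flags_octet & FLAG_B),
    }


if __name__ == "__main__":
    # 0x00: plain internal router, 0x01: ABR, 0x05: ABR and virtual-link endpoint
    for octet in (0x00, 0x01, 0x05):
        print(f"0x{octet:02x} -> {decode_router_lsa_flags(octet)}")
```

A flags octet of 0x01 means only "B" is set, and 0x05 means both "B" and "V" are set.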

First, let’s observe our first topology (the control) with an area 0, which does NOT YET have virtual-links configured.


The Control Topology

In the Type-1 Router LSAs, you see that the “B” bit is always set by R1 first, and then R2 responds:

Wireshark filter: ospf.v2.router.lsa.flags.b == 0 || ospf.v2.router.lsa.flags.b == 1



It seems as if R1 (the router with area 0 configured) first has to tell the adjacent routers that it is an area border router before R2 will respond saying that it too is an area border router. This is just like the "I'll show you mine if you show me yours" game. R2 is being a little shy about exposing its "B" bit until R1 does so first.



R1:

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

router ospf 1
area 12 virtual-link 1.1.1.1


Now, configure the virtual-links as you normally would and observe the PCAP again. After the virtual-links are fully adjacent, you will see that both R1 and R2 are now setting the “B” and “V” bits. That makes sense now that both routers have an interface in area 0, albeit because of the virtual-link itself. The "V" bit means that the originating router is a virtual-link endpoint, and now that the virtual-link is fully adjacent, both routers are exchanging Type-1 Router LSAs with the "B" and "V" bits set.



Moving over to the 'test' topology, which does NOT have an area 0 or virtual-links configured, observe a PCAP while the routers form an adjacency. You'll see that neither R1 nor R2 has its “B” bit set.


The Test Topology (no area 0)


But wait, I thought the RFC said:
Bit B should be set whenever the router is actively attached to two or more areas, even if the router is not currently attached to the OSPF backbone area.

This is exactly what we have: a router with interfaces in 2 areas and no backbone area. The keyword in the RFC is “should”. In RFC parlance, “should” is a recommendation rather than a requirement, so vendors may choose not to implement it. In this case, Cisco does NOT initially set the “B” bit unless the router has at least 1 interface in area 0. If neither router initiates setting the “B” bit, the remote router doesn’t know that the other is an Area Border Router and won’t form a virtual-link.

This is where the phrase "I'll show you mine if you show me yours" comes in. Both routers are being a little shy about their "B" bits and won't identify themselves as area border routers until a router with an interface in area 0 does so first.

There is a scenario where this is not the case.

Let’s take our topology, the one without an area 0, configure its interfaces to be in a VRF, run OSPF in the VRF, and observe a PCAP. You’ll see that the “B” bit is now set. Wait… but why? There isn’t an area 0 configured. I thought you needed an 'area 0' in order for a router to initiate setting the "B" bit?


The Testing Topology (no area 0)


R1:

ip vrf OSPF_TEST

int lo1111
ip vrf forwarding OSPF_TEST
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 1

int eth0/0
ip vrf forwarding OSPF_TEST
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12



Now, configure a virtual-link between the 2 routers, just as you normally would.

The Test Topology (no area 0)

R1:

router ospf 1 vrf OSPF_TEST
area 12 virtual-link 2.2.2.2

R2:

router ospf 1
area 12 virtual-link 1.1.1.1

It comes up!!! Wait… but why? There isn’t an area 0 configured?

The answer can be found in Cisco documentation regarding the OSPF “capability vrf-lite” command: (https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_ospf/command/iro-cr-book/ospf-a1.html#wp2582896905)

The OSPF VRF process acts as an Area Border Router (ABR) when you configure an OSPF process that is associated with a VRF without the capability vrf-lite.

Or, as the master Peter Paluch (@Peter_Paluch) so eloquently put it on the Cisco forums: (https://community.cisco.com/t5/routing/where-to-configure-the-quot-capability-vrf-lite-quot-on-ce-or-pe/m-p/2812308/highlight/true#M260868)

...if an OSPF process is run in a VRF then it automatically and unconditionally considers itself to be an ABR - it believes to be connected to a so-called MPLS Superbackbone (even though there may be no BGP/MPLS configured on the router at all).

In conclusion, it is possible to have fully adjacent virtual links across an OSPF domain where area 0 does NOT exist. By putting its interfaces in a VRF and running OSPF in that VRF, a router will advertise to its adjacent routers that it is an area border router (ABR).

Through my testing I have not found an instance where a router that DOES NOT have an interface in area 0 has initiated setting the "B" bit unless it is configured as part of a VRF.
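To make the three lab outcomes easier to reason about, here is a toy Python model of the behavior observed above. It is only a sketch under stated assumptions, not IOS's actual logic: a router sets "B" if its OSPF process runs in a VRF (without capability vrf-lite) or if it has both a backbone and a non-backbone attachment, and a virtual-link counts as a backbone attachment only once the far end already advertises "B". Neighbor state and reachability of the virtual-link endpoint are deliberately ignored, and the names `Router`, `converge` and `lab` are invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class Router:
    name: str
    areas: Set[int]                # areas with a real interface up
    in_vrf: bool = False           # OSPF in a VRF, no "capability vrf-lite"
    vl_peer: Optional[str] = None  # name of the configured virtual-link peer
    vl_up: bool = False            # virtual link acts as an area 0 interface
    b_bit: bool = False            # router advertises itself as an ABR


def converge(routers: Dict[str, Router]) -> None:
    """Re-evaluate every router until nothing changes (a simple fixed point)."""
    changed = True
    while changed:
        changed = False
        for r in routers.values():
            # Assumption: the virtual link only comes up once the far end sets B.
            vl_up = r.vl_peer is not None and routers[r.vl_peer].b_bit
            has_backbone = 0 in r.areas or vl_up
            has_non_backbone = any(area != 0 for area in r.areas)
            b_bit = r.in_vrf or (has_backbone and has_non_backbone)
            if (vl_up, b_bit) != (r.vl_up, r.b_bit):
                r.vl_up, r.b_bit = vl_up, b_bit
                changed = True


def lab(r1_areas: Iterable[int], r2_areas: Iterable[int],
        r1_in_vrf: bool = False) -> Dict[str, Tuple[bool, bool]]:
    """Build the two-router lab and return {name: (b_bit, vl_up)}."""
    routers = {
        "R1": Router("R1", set(r1_areas), in_vrf=r1_in_vrf, vl_peer="R2"),
        "R2": Router("R2", set(r2_areas), vl_peer="R1"),
    }
    converge(routers)
    return {name: (r.b_bit, r.vl_up) for name, r in routers.items()}


if __name__ == "__main__":
    print("control (area 0):", lab({0, 12}, {2, 12}))
    print("test (no area 0):", lab({1, 12}, {2, 12}))
    print("test, R1 in VRF: ", lab({1, 12}, {2, 12}, r1_in_vrf=True))
```

In this model the control and VRF labs end with both routers setting "B" and the virtual-link up, while the plain no-area-0 lab never leaves its starting state because neither side goes first.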

Thanks to all who helped me through this 2-day deep dive: Nick Russo(@nickrusso42518) and the folks over at The Network Collective.

6 comments:

  1. Can I ask you a couple of questions (for my own understanding)?
    1) just to confirm, Cisco neither sets the B bit nor advertises type 3 LSAs if it doesn't have a connection to area 0. In other words, the router does NOT consider itself an ABR, correct?
    2) when you use the VRF trick to force the router to consider itself an ABR, you are placing both interfaces (the one in area 1 and the one in area 12) in the same VRF, correct?

    1. ylmva1, to answer your questions by number:

      1) you are correct. A router will not generate Type-3 Summary LSAs unless it has an interface in area 0 (or a virtual link)
      2) you are also correct, I put both interfaces (area 1 and area 12) in the VRF

      Thanks for reading.

  2. Thank you for your reply!

    A Juniper router does consider itself an ABR when configured with more than one area, regardless of whether one of those areas is area 0 or not. It actually sets the B bit and advertises LSA-3. However, if an ABR receives an LSA-3 from another ABR but NOT from area 0, it does not inject an LSA type 3 into its local areas.

    I knew about Cisco's behavior regarding the B bit and LSAs type 3; but wanted to confirm. I didn't know about the VRF behavior though, but I imagine that that allows running OSPF over MPLS ...

    THANKS AGAIN!

    1. You're welcome. This thread on Twitter talks about exactly that: the difference between Junos and IOS regarding this type of topology.

      https://twitter.com/NetworkFunTimes/status/1126272090338988032

  3. I think I started it! I was challenging my fellow Ambassadors. ;-)

    The question was: if you have this: "AREA 1 --- AREA 2 --- AREA 3 (OSPF) Which inter-area traffic is allowed?"
    A. Junos: area 1 <=> area 2, and area 2 <=> area 3.
    IOS: no area talks to any other area.

  4. Hi Tony,

    Many observations you have shared here are accurate but I would like to correct a couple of conclusions you have drawn. It is a complex topic - in fact, I needed to dig through the source code to be absolutely sure myself.

    You are spot on in pointing out that while RFC 2328 suggests that routers should set the B-bit anytime they are connected to two or more areas, even if none of them is area 0, Cisco IOS-based routers behave differently. RFC 3509 discusses exactly these differences in which both Cisco and Juniper deviate from vanilla RFC 2328 - but not even this RFC gets it right ;) In particular, Cisco IOS OSPF implementation considers itself to be an ABR if:

    1.) The OSPF process runs in VRF and "capability vrf-lite" is not configured

    OR

    2.) The router has at least one interface up in area 0 AND at least one neighbor in FULL state in another non-backbone area (implies that there is at least one interface up in the non-backbone area as a prerequisite)

    Note the second condition: It requires that the router has at least one interface up in area 0, even a loopback will do, and in addition, the router must not only have an interface up in a non-backbone area, but it must in fact have a live neighbor in that area that is in the FULL state. In other words, a router that has two loopbacks, one in area 0, the other one in area 12, is not an ABR according to IOS. Even in your topology, if R1 has its Lo0 in area 0 and its e0/0 in area 12 but no adjacency to R2, it is not an ABR. There must be at least one non-backbone area with a fully adjacent neighbor, on top of having another interface up in area 0, for an IOS router to become an ABR.

    You have suggested that R2 is waiting for a B-bit in LSA1 from R1 to start advertising the B-bit itself. I do not believe this is true. If you set up your first topology without a virtual link, R2 has absolutely no reason to start advertising the B-bit because it can never become a rightful ABR according to the rules above: While R2 has a non-backbone area with a neighbor in FULL state (area 12 and R1 as the neighbor), it does not have an interface up in the area 0. Regardless of R1's B-bit setting, R2 cannot become an ABR and set the B-bit itself.

    Now, if you configure a virtual link on R2, it will create a virtual OSPF_VL interface that alone is in area 0. If this interface is considered up from OSPF perspective, it will be exactly what's missing on R2 to start considering itself as an ABR. The question is, then - when is OSPF_VL considered to be up? The answer: R2 must be able to resolve the endpoint of its virtual link - R1 - down to a reachable IP address in area 12, AND R1 must be already advertising the B-bit.

    And this is what is happening in your first topology when you add the virtual link config! After configuring it, OSPF_VL is created on R2 (it also gets created on R1 but from an ABR point of view, it's not that important because R1 was an ABR even without it), and once R2 is able to resolve the destination IP address of the virtual link and see that the other end (R1) already advertises the B-bit, it brings OSPF_VL up, thereby meeting all criteria to become an ABR itself, advertises an updated LSA1 with the B-bit set and starts sending targeted hellos to R1. The moment R1 learns about the B-bit from R2, it also starts sending targeted hellos to R2, and they will eventually bring the virtual link fully up.

    This also explains why the virtual link does not come up in your second topology where none of your routers has an interface in area 0. Without having at least one interface in area 0, neither R1 nor R2 can become an ABR. Configuring a virtual link does not help alone because to bring the OSPF_VL interface up, the other end of the virtual link must be advertising a B-bit - but none of them can advertise it because none of them is an ABR before configuring the virtual link. That's why the virtual link does not ever come up; it is a chicken-and-egg scenario here.

    Please feel welcome to ask further!

    Best regards,
    Peter
