Monday, May 13, 2019

I'll show you mine if you show me yours: investigating OSPF's "B" bit


My dedication to delivering accurate content is what has postponed this post for more than 2 months. Every time I set out to document this behavior, I end up double-checking everything and learning something new. Finally I have enough stable material to present...

[1940's Victory at Sea announcer voice...]
Investigating OSPF's "B" bit! The case of "I'll show you mine if you show me yours".

While studying for my CCIE R/S, I was recently working a lab and found myself confused by a lab task. This led me to spend 2 days deep-diving the technology through repetitive labbing and reading RFCs to understand what was going on and to answer the question: "How can you exchange prefixes across multiple areas in an OSPF domain when there is no Area 0?"

Most seasoned engineers would utilize a virtual-link to extend area 0 across normal areas, but what if there is no area 0 to begin with?

Let’s use a simple topology (the control):

The Control Topology

R1:

int lo1111
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 0

int eth0/0
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 1.1.1.1


In this topology you would build a virtual-link between Routers 1 & 2 across area 12 to extend area 0, so that R2 can share the networks of area 2 with the rest of the OSPF domain.

Now, let’s change area 0 to area 1.
  • How can we establish full connectivity?
  • Create a virtual-link…?
Try it.

R1:

int lo1111
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 1

int eth0/0
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12

router ospf 1
area 12 virtual-link 1.1.1.1


The Test Topology (no area 0)


You’ve configured a virtual-link but it won’t come up? Why not?

You can troubleshoot and run debugs such as ‘debug ip ospf adj’, but they won’t give you any reason why this isn’t working.


When you’re completely lost and have nowhere else to turn, the last place you look is the RFC. (I’m still trying to train myself to look at RFCs earlier on in the troubleshooting process.)

From RFC2328 Section “15. Virtual Links”
...Virtual links serve to connect physically separate components of the backbone. The two endpoints of a virtual link are area border routers. The virtual link must be configured in both routers. The configuration information in each router consists of the other virtual endpoint (the other area border router), and the non-backbone area the two routers have in common (called the Transit area). Virtual links cannot be configured through stub areas (see Section 3.6).
So, the question is, how do 2 remote routers know if they are “area border routers”? In most Cisco documentation an “area border router” is defined as having at least one interface in area 0 and other interfaces in one or more other areas. How does the RFC suggest the routers identify as Area Border Routers?

In RFC2328, Section “12.4.1. Router-LSAs”

A router also indicates whether it is an area border router, or an AS boundary router, by setting the appropriate bits (bit B and bit E, respectively) in its router-LSAs. This enables paths to those types of routers to be saved in the routing table, for later processing of summary-LSAs and AS-external-LSAs. Bit B should be set whenever the router is actively attached to two or more areas, even if the router is not currently attached to the OSPF backbone area. Bit E should never be set in a router-LSA for a stub area (stub areas cannot contain AS boundary routers).
Ok, now we have something we can check. Let’s perform an experiment against a control and grab a PCAP between the 2 routers to see if the “B” bit is set coming from either direction in the Type-1 Router LSAs.
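When reading the capture, it helps to know exactly where these bits live. As a rough sketch (the function names are illustrative and not from any library), the first byte of a Router-LSA body carries the V, E and B flags in its three low-order bits, per RFC 2328 Appendix A.4.2, and the body starts right after the fixed 20-byte LSA header:

```python
import struct

# Flag bits in the first byte of a Router-LSA body (RFC 2328, A.4.2).
FLAG_B = 0x01  # router is an area border router
FLAG_E = 0x02  # router is an AS boundary router
FLAG_V = 0x04  # router is an endpoint of a fully adjacent virtual link

LSA_HEADER_LEN = 20  # fixed-size LSA header that precedes the body


def decode_router_lsa_flags(flags):
    """Return the V/E/B bits of a Router-LSA flags byte as booleans."""
    return {
        "V": bool(flags & FLAG_V),
        "E": bool(flags & FLAG_E),
        "B": bool(flags & FLAG_B),
    }


def flags_from_router_lsa(lsa):
    """Decode the flags from a complete Type-1 LSA (header plus body)."""
    ls_type = lsa[3]  # header layout: LS age (2 bytes), options (1), LS type (1)
    if ls_type != 1:
        raise ValueError("not a Router-LSA")
    # Body starts with: flags (1 byte), zero (1 byte), number of links (2 bytes)
    flags, _zero, _num_links = struct.unpack_from("!BBH", lsa, LSA_HEADER_LEN)
    return decode_router_lsa_flags(flags)
```

A flags byte of 0x01 describes a plain ABR, while 0x05 describes an ABR that is also a virtual-link endpoint.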

First, let’s observe our first topology (the control), which has an area 0 but does NOT YET have virtual-links configured.


The Control Topology

In the Type-1 Router LSAs, you see the “B” bit is always set by R1 first, and then R2 responds:

Wireshark Filter: ospf.v2.router.lsa.flags.b == 0 || ospf.v2.router.lsa.flags.b == 1



It seems as if R1 (the router with area 0 configured) has to first tell the adjacent routers that it is an area border router before R2 will respond saying it too is an area border router. This is just like the "I'll show you mine if you show me yours" game. R2 is being a little shy about exposing its "B" bit until R1 does first.



R1:

router ospf 1
area 12 virtual-link 2.2.2.2

R2:

router ospf 1
area 12 virtual-link 1.1.1.1


Now, configure the virtual-links like you normally would and observe the PCAP again. After the virtual-links are fully adjacent, you will see both R1 and R2 are now setting the “B” and “V” bits. This makes sense now that both routers have an interface in area 0, albeit because of the virtual-link itself. The "V" bit means that the originating router is a virtual-link endpoint, and now that the virtual-link is fully adjacent both routers are exchanging Type-1 Router LSAs with the "B" & "V" bits set.



Moving over to the 'test' topology that DOES NOT have an area 0 or virtual-links configured, observe a PCAP while the routers form an adjacency. You'll see neither R1 nor R2 has its “B” bit set.


The Test Topology (no area 0)


But wait, I thought the RFC said:
Bit B should be set whenever the router is actively attached to two or more areas, even if the router is not currently attached to the OSPF backbone area.
This is exactly what we have: a router with interfaces in 2 areas and no backbone area. The keyword in the RFC is “should”. In RFC parlance this is a recommendation rather than a hard requirement, so vendors are free to implement it differently. In this case Cisco does NOT initially set the “B” bit unless the router has at least 1 interface in area 0. If neither router initiates setting the “B” bit, the remote router doesn’t know the other is an Area Border Router and won’t form a virtual-link.
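To make the difference concrete, here is a small Python sketch of the two rules. It is only a model of what this lab showed, not Cisco's actual logic, and the function names are made up for illustration:

```python
BACKBONE = 0  # OSPF area 0


def b_bit_per_rfc2328(areas):
    """RFC 2328 12.4.1 wording: set B when attached to two or more areas."""
    return len(set(areas)) >= 2


def initiates_b_bit_observed(areas):
    """Behavior seen in this lab (no VRF involved): a router only starts
    setting B on its own when one of its attached areas is the backbone."""
    areas = set(areas)
    return len(areas) >= 2 and BACKBONE in areas


# R1 in the control topology (areas 0 and 12) vs. the test topology (1 and 12)
for name, areas in (("control R1", [0, 12]), ("test R1", [1, 12])):
    print(name, b_bit_per_rfc2328(areas), initiates_b_bit_observed(areas))
```

For the test topology the RFC wording says B should be set while the observed rule says it is not, which is exactly the gap the capture shows.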

This is where the phrase "I'll show you mine if you show me yours" comes in. Both routers are being a little shy about their "B" bits and won't identify themselves as area border routers until a router with an interface in area 0 does so first.

There is a scenario when this is not the case.

Let’s take our topology, the one without an area 0, configure R1's interfaces to be in a VRF, run OSPF in the VRF, and observe a PCAP. You’ll see the “B” bit is now set. Wait… but why? There isn’t an area 0 configured? I thought you needed an 'area 0' in order for a router to initiate setting the "B" bit?


The Test Topology (no area 0)


R1:

ip vrf OSPF_TEST

int lo1111
ip vrf forwarding OSPF_TEST
ip add 1.1.1.1 255.255.255.255
ip ospf 1 area 1

int eth0/0
ip vrf forwarding OSPF_TEST
ip add 10.1.12.1 255.255.255.0
ip ospf 1 area 12

R2:

int lo2222
ip add 2.2.2.2 255.255.255.255
ip ospf 1 area 2

int eth0/0
ip add 10.1.12.2 255.255.255.0
ip ospf 1 area 12



Now, configure a virtual-link between the 2 routers, just as you normally would.

The Test Topology (no area 0)

R1:

router ospf 1 vrf OSPF_TEST
area 12 virtual-link 2.2.2.2

R2:

router ospf 1
area 12 virtual-link 1.1.1.1

It comes up!!! Wait… but why? There isn’t an area 0 configured?

The answer can be found in Cisco documentation regarding the OSPF “capability vrf-lite” command: (https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_ospf/command/iro-cr-book/ospf-a1.html#wp2582896905)

The OSPF VRF process acts as an Area Border Router (ABR) when you configure an OSPF process that is associated with a VRF without the capability vrf-lite.

Simply put, as the master Peter Paluch (@Peter_Paluch) so eloquently stated on the Cisco forums: (https://community.cisco.com/t5/routing/where-to-configure-the-quot-capability-vrf-lite-quot-on-ce-or-pe/m-p/2812308/highlight/true#M260868)

...if an OSPF process is run in a VRF then it automatically and unconditionally considers itself to be an ABR - it believes to be connected to a so-called MPLS Superbackbone (even though there may be no BGP/MPLS configured on the router at all).
In conclusion, it is possible to have fully-adjacent virtual links across an OSPF domain where area 0 does NOT exist. By putting its interfaces in a VRF and running OSPF in that VRF, a router advertises to its adjacent routers that it is an area border router (ABR).

Through my testing I have not found an instance where a router that DOES NOT have an interface in area 0 has initiated setting the "B" bit unless it is configured as part of a VRF.
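The lab results can be summarized in a toy model. This is only a sketch of the behavior described in this post, under the assumption that a virtual-link needs at least one endpoint to initiate the "B" bit; the Router class and the function names are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    areas: frozenset        # areas the router has interfaces in
    in_vrf: bool = False    # OSPF process is bound to a VRF
    vrf_lite: bool = False  # "capability vrf-lite" is configured


def initiates_b_bit(r):
    # A VRF-bound process without capability vrf-lite acts as an ABR.
    if r.in_vrf and not r.vrf_lite:
        return True
    # Otherwise the lab only saw B initiated with an interface in area 0.
    return len(r.areas) >= 2 and 0 in r.areas


def virtual_link_comes_up(a, b, transit):
    # Both ends must sit in the transit area, and one end must
    # "show its B bit" first so the other responds in kind.
    shares_transit = transit in a.areas and transit in b.areas
    return shares_transit and (initiates_b_bit(a) or initiates_b_bit(b))


r2 = Router("R2", frozenset({2, 12}))
scenarios = {
    "control (area 0)": Router("R1", frozenset({0, 12})),
    "test (no area 0)": Router("R1", frozenset({1, 12})),
    "test + VRF on R1": Router("R1", frozenset({1, 12}), in_vrf=True),
}
for label, r1 in scenarios.items():
    print(label, virtual_link_comes_up(r1, r2, transit=12))
```

The model reports the virtual-link coming up in the control and VRF scenarios but not in the plain test topology, matching the three captures.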

Thanks to all who helped me through this 2-day deep dive: Nick Russo(@nickrusso42518) and the folks over at The Network Collective.