NoSwitchport.com

Quick and Dirty Web Proxy Load Balancing Using PAC Files

Posted in Networking by shaw38 on October 21, 2010

While digging through the archives of the Cisco IronPort support knowledge base, I came across a pretty slick solution for load balancing client web traffic between two or more proxy servers in a PAC file-based deployment. So what is a PAC file? A PAC file is a text file containing policy information, written in JavaScript and interpreted by a web browser each time an HTTP request is made. The policy defines whether the web browser should send a given request to a proxy server or bypass the proxy and connect directly. Typically, this is a fairly vanilla policy. For example (see comments for detail):

function FindProxyForURL(url, host)
{
	if (isPlainHostName(host) || dnsDomainIs(host, ".test.net")) // Bypass for non-dotted hostname or test.net domain
		return "DIRECT";
	else if (isInNet(host, "10.0.0.0", "255.0.0.0"))		// Bypass proxy for RFC1918
        return "DIRECT";
	else if (isInNet(host, "172.16.0.0", "255.240.0.0"))	// Bypass proxy for RFC1918
        return "DIRECT";
	else if (isInNet(host, "192.168.0.0", "255.255.0.0"))	// Bypass proxy for RFC1918
        return "DIRECT";
	else if (isInNet(host, "127.0.0.0", "255.0.0.0"))		// Bypass proxy for RFC3330
        return "DIRECT";
 	else
        return "PROXY IPROXY01:8080; PROXY IPROXY02:8080; DIRECT";
}													

Using the PAC file above, all web traffic not matching a conditional statement resulting in a “direct” action will be sent only to iproxy01 in a steady state. Upon failure of iproxy01, traffic will be sent to iproxy02. What if we have 40Mb of internet traffic we would like to load balance between the two? We could deploy WCCP and move to transparent redirection, but are there any options with a PAC file? Absolutely!

Since a PAC file is JavaScript-based, the full JavaScript language is at your disposal to manipulate policy as you see fit. We’ll need to instruct the web browser to send connections to either web proxy based on the result of some sort of algorithm. To accomplish this, we can write a JavaScript function using a couple of methods from the built-in Math object:

function selectRandomProxy()
{
	switch( Math.floor( Math.random() *2))		// Randomly generate an integer of 0 or 1
	{
		case 0: return "PROXY IPROXY01:8080; PROXY IPROXY02:8080; DIRECT;"
		case 1: return "PROXY IPROXY02:8080; PROXY IPROXY01:8080; DIRECT;"
	}
}												

This function (selectRandomProxy()) will randomly select either case 0, which sends web traffic to iproxy01 first, or case 1, which sends web traffic to iproxy02 first. Math.random() returns a random value greater than or equal to 0.0 and less than 1.0 (e.g. 0.7234213). This value is then multiplied by 2. Math.floor() then rounds the result down to the largest integer no greater than it. For example, if Math.random() generates a random value of 0.25, multiplying by two gives 0.50, which Math.floor() rounds down to zero. If Math.random() generates a random value of 0.75, multiplying by two gives 1.50, which Math.floor() rounds down to one. A switch statement then evaluates the resulting integer against a list of cases and returns the matching case.
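
If your two proxies differ in capacity, the same idea extends to a weighted split. The function below is a hypothetical variant of my own, not from the IronPort article; the random draw is passed in as a parameter purely so the boundary logic is easy to test — in a real PAC file you would call it as selectWeightedProxy(Math.random()):

```javascript
// Hypothetical weighted variant: lead with iproxy01 for ~70% of requests.
// r is expected to be in [0.0, 1.0), e.g. the result of Math.random().
function selectWeightedProxy(r)
{
	if (r < 0.7)		// ~70% of draws land below 0.7
		return "PROXY IPROXY01:8080; PROXY IPROXY02:8080; DIRECT;";
	else				// remaining ~30% lead with iproxy02
		return "PROXY IPROXY02:8080; PROXY IPROXY01:8080; DIRECT;";
}
```

Adjusting the 0.7 threshold shifts the steady-state traffic ratio between the two proxies.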

Now we’ll integrate this new function into our original PAC file:

function FindProxyForURL(url, host)
{
      if (isPlainHostName(host) || dnsDomainIs(host, ".test.net"))
            return "DIRECT";
	  else if (isInNet(host, "10.0.0.0", "255.0.0.0"))
            return "DIRECT";
	  else if (isInNet(host, "172.16.0.0", "255.240.0.0"))
            return "DIRECT";
	  else if (isInNet(host, "192.168.0.0", "255.255.0.0"))
            return "DIRECT";
	  else if (isInNet(host, "127.0.0.0", "255.0.0.0"))
            return "DIRECT";
 	  else
            return selectRandomProxy();
}
function selectRandomProxy()
{
	switch( Math.floor( Math.random() *2))
	{
		case 0: return "PROXY IPROXY01:8080; PROXY IPROXY02:8080; DIRECT;"
		case 1: return "PROXY IPROXY02:8080; PROXY IPROXY01:8080; DIRECT;"
	}
}

The web browser will evaluate the configured PAC file prior to every new HTTP request and, over time, this will result in nearly a 50/50 distribution of traffic between both web proxies.

A word of caution: There is no intelligence or session tracking in the load balancing decision making. It’s completely stateless. During a single HTTP session, objects will be fetched using both web proxies. While this isn’t necessarily an issue from the perspective of the web proxies, it may wreak some havoc on web apps behind a load balancer relying on session stickiness by source IP address. As HTTP objects that are part of the same session are fetched from two different source IP addresses (the two web proxies), the destination load balancer will see what looks like a new session, which may not be “stuck” to the same real server. As long as both proxies are PAT’d to the same address, this shouldn’t cause an issue. Also, if you are doing any type of SSL termination on your web proxies for content inspection, this will cause you some problems.
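
If stickiness does matter, one common workaround (not covered in the IronPort article — this is my own sketch, with a deliberately toy hash) is to select the proxy deterministically by hashing the destination hostname, so every request to a given host always uses the same proxy:

```javascript
// Sketch: deterministic proxy selection keyed on the destination host.
// Every request to the same hostname returns the same proxy order, at the
// cost of a less even split than pure random selection.
function selectProxyByHost(host)
{
	var hash = 0;
	for (var i = 0; i < host.length; i++)
		hash = (hash * 31 + host.charCodeAt(i)) % 65536;	// simple toy hash
	if (hash % 2 === 0)
		return "PROXY IPROXY01:8080; PROXY IPROXY02:8080; DIRECT;";
	else
		return "PROXY IPROXY02:8080; PROXY IPROXY01:8080; DIRECT;";
}
```

In FindProxyForURL you would return selectProxyByHost(host) in place of selectRandomProxy(). The distribution is only as even as your hostname mix, but a session to one site never straddles both proxies.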

Mapping Cisco ASA VPN Users to Tunnel Groups via Tunnel-Group-Lock on ACS

Posted in Networking by shaw38 on August 13, 2010

Over the past 9 months since I changed jobs, I’ve had a ton of opportunities to work with the Cisco ASA. Up until this point, our only clients utilizing the IPSec VPN client have been internal employees authenticated against Active Directory (via ACS) and a handful of vendors with locally created accounts on the ASA. However, a recent security policy change has dictated all users, both employees and vendors, must be authenticated against Active Directory but Radius must be utilized for accounting. No big deal, right? Move the local accounts on the ASA to AD, create new group mappings in ACS and communicate with the vendors. Not so fast. As it turns out, this will all work fine but, under the covers, a security problem has been created. Let’s look at the problem at a high level.

Here’s the configuration:

  • In the ASA, there are two tunnel-groups created: Employee and Vendor. The Employee tunnel-group allows full access to the internal network. The Vendor tunnel-group allows restricted access to the internal network.
  • In Active Directory, there are two groups/OUs created: Employee and Vendor.
  • In Radius, there are two groups created: Employee and Vendor. These are mapped to their respective groups in AD.

The Good

Let’s follow vendor JoeyNT logging into his VPN client, specifying tunnel-group “Vendor”:

Vendor JoeyNT is successfully authenticated! Nice job, Joey. As you can see:

  1. The client’s credentials are passed from the ASA to Radius to Active Directory (AD).
  2. When Radius passes the client’s credentials to AD, it also asks which groups (OUs) the user belongs to via Microsoft Net Logon. These are passed back to Radius upon successful authentication.
  3. The client is placed into the AD-mapped group in Radius.
  4. Radius sends an auth success/failure message back to the ASA.
  5. In this case, the ASA receives an auth success and permits the client.

The one important item to note is that, by default, tunnel-group membership is not conveyed in any way between Radius and the ASA. Radius simply sends a pass/fail message. This becomes the root of our problem.

The Bad

Now vendor JoeyNT is dissatisfied with his level of access and is feeling a bit ambitious. He has acquired the “Employee” PCF file from a developer within our organization to help with “support issues”. He logs into his VPN client with the Employee tunnel-group specified:

As you can see, JoeyNT is authenticated successfully and is placed into the Employee tunnel-group, allowing full access to the internal network, even though he is part of the Vendor OU in AD and placed in the Vendor group in Radius. Uh oh.

The Solution

What we need is a way for Radius to tell the ASA what tunnel-group the client should be allowed, along with the auth success message.

Enter vendor-specific-attribute (VSA) 3076/85 – Tunnel-Group-Lock.

By enabling this, Radius will send the locally configured tunnel group name to the ASA, based on the AD-to-Radius group mapping. If this does not match the tunnel-group the client is attempting to join, the client will be denied access. With Tunnel-Group-Lock in use, here is our oh so adventurous vendor JoeyNT attempting to authenticate to the Employee tunnel-group:

The Configuration

In Cisco ACS, this first needs to be enabled under Interface Configuration–>RADIUS (Cisco VPN 3000/ASA/PIX 7.x+):

Then check the box under [026/3076/085] Tunnel-Group-Lock and click submit:

Now under Group Setup, each group will have the following under the Cisco VPN 3000/ASA/PIX v7.x+ RADIUS Attributes section. The value specified here MUST match a configured tunnel-group on the ASA:

That’s it!

Footnote: I believe this can also be done using IETF class attribute 25 but I have not tested this.

Internet-based DMVPN coming through your front-door (VRF, that is)

Posted in Networking by shaw38 on April 20, 2010

While working on the design for a 20+ site DMVPN migration, I realized something often overlooked in the documentation for an internet-based DMVPN deployment. To maintain a zero (or minimal) touch deployment model in an internet-based DMVPN, default routing is a must for dynamic tunnel establishment between hubs and spokes. The public addressing of spoke routers is typically at the mercy of one or more service providers, and even if you have been allocated a static address per the service contract, these still have a tendency to change for reasons outside the customer’s control. This is especially true in teleworker-type deployments with a broadband service provider. To deal with this, an engineer has two options: maintain on every hub/spoke router a list of static routes comprising every public and next-hop address in the DMVPN environment, or use a static default route pointing out the public interface.

Tough decision, huh? Not so fast.

What happens when you have a transparent proxy deployed in your network at the hub site? No problem, just have the spoke routers carry a default route advertised into the IGP from the hub site. Wait…we are already using a default route to handle DMVPN tunnel establishment between spoke routers. To resolve this issue, we need two default routes: one for clients within the VPN and one for establishing spoke-to-spoke tunnels. We could add two defaults to the same routing table with the same administrative distance but load balancing is not the behavior we want and our tunnels would throw a fuss due to route recursion. How about policy-based routing with the local policy command configured for router-initiated traffic? Pretty ugly. Enter FVRF or Front-door VRF.

Front-door VRF takes advantage of the VRF-aware features of IPSec. While the scant Cisco documentation touts it as a security feature, separating your private routing table into a construct isolated from your public address space, this feature also provides an ideal solution for maintaining separate routing topologies for DMVPN control-plane traffic and user data-plane traffic.

So how does all this work? Pretty simply if you are familiar with the VRF concept. First, on your spoke routers, create a VRF to be used for resolving tunnel endpoints:

ip vrf FVRF
 description FRONT_DOOR_VRF_FOR_TUNNEL_MGMT
 rd 1:1

Add the publicly addressed or outside-facing interface to the VRF:

interface FastEthernet0/1
 ip vrf forwarding FVRF
 ip address 10.1.1.1 255.255.255.252

Now, we need to configure our ISAKMP/IPSec policy in a VRF-aware fashion:

crypto isakmp policy 1
 authentication pre-share
 group 2
 encr 3des
!
crypto keyring DMVPN vrf FVRF
 pre-shared-key address 0.0.0.0 0.0.0.0 key PR35H4R3D
!
crypto ipsec transform-set DMVPN esp-3des
 mode transport
!
crypto ipsec profile DMVPN
 set security-association lifetime seconds 1800
 set transform-set DMVPN 
 set pfs group2

Note the only VRF-specific configuration is the crypto keyring statement. Both the ISAKMP policy and IPSec transform-set configurations are no different than in a typical deployment. GET VPN could be used instead, if your security posture calls for it.

Next up–configuration of the mGRE interface:

interface Tunnel1
 ip address 10.2.2.1 255.255.255.0
 no ip redirects
 ip mtu 1400
 ip nhrp authentication DMVPN
 ip nhrp map multicast 2.2.2.2
 ip nhrp map 10.2.2.254 2.2.2.2
 ip nhrp network-id 1
 ip nhrp holdtime 450
 ip nhrp nhs 10.2.2.254
 ip nhrp shortcut
 ip nhrp redirect
 ip tcp adjust-mss 1360
 load-interval 30
 qos pre-classify
 tunnel source FastEthernet0/1
 tunnel mode gre multipoint
 tunnel key 1
 tunnel vrf FVRF
 tunnel protection ipsec profile DMVPN

Configuring the tunnel interface is standard fare except for the “tunnel vrf” statement. This command forces the far-side tunnel endpoint to be resolved in the specified VRF. By default, tunnel endpoint resolution takes place in the global table, which is obviously not the behavior we want. Also, notice the “ip nhrp shortcut” and “ip nhrp redirect” commands. These two commands mean we are using DMVPN Phase 3 and its fancy CEF rewrite capability for spoke-to-spoke tunnel creation.

Last, let’s add our default route within the VRF:

ip route vrf FVRF 0.0.0.0 0.0.0.0 10.1.1.2 name DEFAULT_FOR_FVRF

And we’re done! At this point, assuming your hub site configuration is correct, you should have a working DMVPN tunnel.

In the output below, notice the “fvrf” and “ivrf” sections under tunnel interface 1. The concept of IVRF is the exact opposite of FVRF: tunnel control-plane traffic operates in the global routing table, and your private side operates in a VRF. IVRF can be tricky in that, if your spoke routers are managed over the tunnel, all management functionality (SNMP, SSH, etc.) must be VRF-aware. Recent IOS releases have been much better with VRF-aware features but YMMV:

Test-1841#sh crypto session detail 
Crypto session current status

Code: C - IKE Configuration mode, D - Dead Peer Detection     
K - Keepalives, N - NAT-traversal, T - cTCP encapsulation     
X - IKE Extended Authentication, F - IKE Fragmentation

Interface: Tunnel1
Uptime: 3d22h
Session status: UP-ACTIVE     
Peer: 2.2.2.2 port 500 fvrf: FVRF ivrf: (none)
      Phase1_id: 2.2.2.2
      Desc: (none)
  IKE SA: local 10.1.1.1/500 remote 2.2.2.2/500 Active 
          Capabilities:D connid:1048 lifetime:01:41:34
  IPSEC FLOW: permit 47 host 10.1.1.1 host 2.2.2.2
        Active SAs: 2, origin: crypto map
        Inbound:  #pkts dec'ed 114110 drop 0 life (KB/Sec) 4396354/1063
        Outbound: #pkts enc'ed 119898 drop 492 life (KB/Sec) 4396347/1063

You can now configure your favorite flavor of IGP as you normally would (globally, that is) without impacting DMVPN control-plane traffic. In this scenario, OSPF is used with the tunnel interfaces configured as a point-to-multipoint network type. The static default route in the FVRF table handles tunnel establishment while the dynamically-learned default via OSPF handles the user data plane within the VPN:

Test-1841#sh ip route vrf FVRF 0.0.0.0

Routing Table: FVRF
Routing entry for 0.0.0.0/0, supernet
  Known via "static", distance 1, metric 0, candidate default path
  Routing Descriptor Blocks:
  * 10.1.1.2
      Route metric is 0, traffic share count is 1
Test-1841#
Test-1841#
Test-1841#
Test-1841#sh ip route 0.0.0.0

Routing entry for 0.0.0.0/0, supernet
  Known via "ospf 100", distance 110, metric 101, candidate default path, type inter area
  Last update from 10.2.2.254 on Tunnel1, 3d23h ago
  Routing Descriptor Blocks:
  * 10.2.2.254, from 10.2.2.254, 3d23h ago, via Tunnel1
      Route metric is 101, traffic share count is 1

Front-door VRF works best when used on both hub and spoke routers. Why? Well, anytime a new spoke is to be provisioned, you have to do zero configuration on the hub site. Configure the spoke router, ship it out the door, and have the field plug it in at their convenience.


Link-state tracking, VMware ESX and You

Posted in Uncategorized by shaw38 on January 22, 2010

This post could also be titled “How to build a healthy, long-lasting relationship with your system administration team”. One of the most important (and overlooked) pieces of deploying VMware ESX in a network is handling an upstream network failure. Because larger organizations have segregated network and system administration teams, the switchport tends to be the demarcation of responsibility. Where this particularly breaks down is in how a network component failure, be it an upstream switch or router, is detected and reacted to.

With the increased push towards server consolidation and deployment of VMware, the “routed is better” mantra has become muted by the layer 2 requirements of virtual machine mobility. A virtualized server can also present cable density issues, with each server possibly needing 6 NICs (2 x Production, 2 x VMkernel, 1 x Backup, 1 x iLO). From a network design perspective, a VMware deployment screams for a top-of-rack switching model. Top-of-rack switching and VMware ESX physical NIC (pNIC) failure detection methods can present some interesting challenges.

VMware ESX allows for two options to detect an upstream network failure: Beacon Probing and Link Status. Here is an in-depth summary of both methods:

http://blogs.vmware.com/networking/2008/12/using-beaconing-to-detect-link-failures-or-beaconing-demystified.html

Basically, beacon probing is pretty awful if you’re a network admin. It will send broadcasts out each physical interface of the ESX server for EACH VLAN configured (if using dot1q tagging, which you should be). So that is:

p number of physical servers x n number of pNICs per server x v number of vlans = broadcast storm

Link status is the preferred failure detection method but it will only track the state of the local link (between the ESX server and the switch). This tells the ESX server nothing about the switch’s ability to forward frames. This is where link-state tracking comes in. Link-state tracking will convey the switch’s upstream link-state to the local link of the ESX server by creating a logic gate between upstream and downstream links.

Suppose you have the following loop-free network topology deployed in your data center:

The network failure detection method configured on the ESX server is link status. Most likely your ESX server is sending frames out both interfaces due to the particular load balancing configuration, but in this case we are only interested in frames sent to the switch on the left. In the event the left switch’s uplink fails, we will experience a black hole situation for some of our traffic leaving the ESX server:

By utilizing link status as our ESX failure detection method, the ESX server merely tracks physical link state at layer 1 and the ability of the upstream switch to forward frames is not taken into account:

Link-state tracking configured on the switch will convey this uplink failure to the link directly connected to the ESX server. Let’s get our switch configured correctly (which is stupidly simple):

First, define your link state group globally:

Switch(config)#link state track 1

Then define your upstream links within the link state group:

interface GigabitEthernet1/0/1

link state group 1 upstream

Lastly, define your downstream links:

interface GigabitEthernet1/0/2

link state group 1 downstream

Now the upstream link state will be conveyed to the downstream links, which will cause the link to the ESX server to be shut down in the event the upstream switch link goes down. Interfaces are now coupled within link state group 1:

Once the upstream link failure occurs and the interface is marked as down, the resulting action created by link state tracking is to bring down all downstream interfaces:

By bringing down the physical state of the interfaces to the ESX servers, the action by ESX link status tracking will be to initiate a pNIC failover event:

This will in turn create a long and happy relationship between network and system administrators and eliminate another instance of finger pointing when redundancy fails to function correctly.

Optimized Edge Routing Overview

Posted in Networking by shaw38 on January 8, 2010

If you can tell me of a more understated topic on the CCIE Routing and Switching v4.0 lab blueprint than Optimized Edge Routing (OER), I’ll buy you a beer. This was quietly snuck into the blueprint in between policy-based routing and redistribution, both fairly straightforward topics. Should be no big deal right? False.

OER removes the rigidity of standard IP routing, where typical routing metrics are derived from physical-layer measurements and, in turn, dictate a generic routing policy for all traffic. OER instead gathers higher-level performance metrics through IP SLA and NetFlow information and uses them to determine the optimal exit point for certain destination prefixes or traffic classes. Once the ideal exit point has been decided, routing policy is dynamically updated to influence the specific traffic class.

Navigating through the configuration guide for OER can be daunting but configuration can be broken down into 5 steps:

1. Profile

  • Select a subset of traffic whose performance should be optimized
  • The router can learn the flows passing through it with the highest delay or throughput
  • Or a class of traffic to performance-route can be statically configured

2. Measure

Once traffic has been profiled, metrics need to be generated against it. This is done through:

  • Passive monitoring – measuring the performance of a traffic flow as the flow traverses the data path
  • Active monitoring – generating and measuring synthetic traffic to emulate the traffic class being monitored
  • Both can be deployed: passive monitoring can determine whether a flow is out of OER policy, and active monitoring can find the most optimized alternate path

3. Apply Policy

  • Performance metrics are compared to a set of low and high thresholds and a determination is made if the metrics are out of policy
  • Traffic class policies – defined for prefixes or for applications
  • Link policies – defined for entrance or exit links at the edge

4. Control

  • Traffic flow is modified to enhance network performance
  • Methods for modifying routing policy:
    • For traffic classes defined using a specific prefix, traditional routing information can be modified using BGP or an IGP to add/remove a route
    • For traffic classes defined by application (prefix + upper layer protocol), there are two methods:
      • Device specific: Policy-based routing
      • Network specific:
        • Overlay performance – MPLS or mGRE to reach any other device at the network edge
        • Context Enhanced Protocols – BGP/OSPF/EIGRP are enhanced to communicate upper-layer information along with a prefix

5. Verify

  • Once traffic is flowing through the preferred exit point, the traffic class is verified again against the traffic policy
  • If the traffic is determined to still be out of profile, the controls put in place are reverted and the measurement phase restarts
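
The five phases above map onto a fairly small amount of configuration. Here is a minimal single-router sketch, with the master and border processes on the same box; the addresses, interfaces, and key chain are hypothetical placeholders:

```
key chain OER
 key 1
  key-string OER_KEY
!
oer master
 border 10.1.1.1 key-chain OER
  interface FastEthernet0/0 external
  interface FastEthernet0/1 internal
 learn
  throughput
  delay
!
oer border
 local Loopback0
 master 10.1.1.1 key-chain OER
```

The “learn” section handles the profile phase, while measurement, policy, control, and verification run with default settings until tuned.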

Route Redistribution (Over?)Simplified…

Posted in Uncategorized by shaw38 on January 7, 2010

Here’s the best document on route redistribution I’ve read so far:

http://www.cisco.com/application/pdf/paws/8606/redist.pdf

And it all can be summed up in this statement:

“Avoiding these types of problems is really quite simple: never announce the information originally received from routing process X back into routing process X.”

And it truly is that simple. Always mark/tag/color routes based on their source routing domain and when redistributing, select which routes to redistribute. After all, routes are merely destination information. It’s all about who needs to know and from whom they need to know it.
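
In IOS, that advice translates directly into tagging routes on the way out and filtering them on the way back in. A minimal two-protocol sketch, with hypothetical process numbers and tag values:

```
! Routes entering EIGRP from OSPF get tag 100; never re-announce tag 200 (EIGRP-origin)
route-map OSPF_TO_EIGRP deny 10
 match tag 200
route-map OSPF_TO_EIGRP permit 20
 set tag 100
!
! Routes entering OSPF from EIGRP get tag 200; never re-announce tag 100 (OSPF-origin)
route-map EIGRP_TO_OSPF deny 10
 match tag 100
route-map EIGRP_TO_OSPF permit 20
 set tag 200
!
router eigrp 100
 redistribute ospf 100 route-map OSPF_TO_EIGRP metric 10000 100 255 1 1500
!
router ospf 100
 redistribute eigrp 100 subnets route-map EIGRP_TO_OSPF
```

Each deny clause is exactly the document’s rule in practice: information originally received from routing process X is never announced back into routing process X.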

Generating Pseudo-Random IPv6 Global IDs for Unique Local Unicast Addresses

Posted in Networking by shaw38 on January 4, 2010

I’ve been delving into the multiple RFCs associated with the creation of IPv6 this weekend and came across an interesting section in RFC 4193 – Unique Local IPv6 Unicast Addresses. First, a little background:

IPv6 unique local unicast addresses are the equivalent of IP version 4 RFC 1918 space in most ways and are formatted in the following fashion:

  • 7-bit Prefix – FC00::/7
  • 1-bit Local bit (position 8) – Always set to “1”…for now
  • 40-bit “kinda-almost-unique” Global ID
  • 16-bit Subnet-ID
  • 64-bit Interface ID

The intention and scope of these addresses is unicast-based intra/inter-site communication. The definition of a “site” within the plethora of IPv6 RFCs is slightly ambiguous, but in the case of RFC 4193, the demarcation of a “site” is between ISP and customer. According to the RFC, unique local unicast addresses are permitted to be used between “sites”, e.g. customer-to-customer VPN communication, but the FC00::/7 prefix is to be filtered by default at any site-border router. This space is not intended to be advertised to any portion of the internet.

Now the interesting portion of this RFC is the recommended algorithm for generating a realistically unique yet theoretically common 40-bit Global ID for your local unicast addresses. Section 3.2.2 recommends the following:

  1. Obtain the current time of day in 64-bit NTP format
    • i.e.  reference time is C029789C.45564D4E
  2. Obtain an EUI-64 identifier from the system running this algorithm
    • i.e. bia of C201.0DC8.0000
  3. Concatenate the time of day with the system-specific identifier in order to create a key
    • i.e C029789C45564D4E.C2010DC80000
  4. Compute an SHA-1 digest on the key; the resulting value is 160 bits
    • Here’s a handy web-based calculator to generate multiple different message digest flavors : http://www.fileformat.info/tool/hash.htm
    • Our resulting SHA-1 hash of C029789C45564D4E.C2010DC80000 is 4D958078E1C1C2f3DEBA10C1DC7899E6A21D2B9F
  5. Use the least significant 40 bits as the Global ID
    • Our 40 least-significant bits results in E6A21D2B9F (hint: just count over 10 hexadecimal digits from the right)
  6. Concatenate FC00::/7, the L bit set to 1, and the 40-bit Global ID to create a Local IPv6 address prefix
    • FC00::/7 with the L-bit (8th bit) set to 1 = FD00::/8
    • A concatenation of the prefix plus our generated Global ID is FDE6:A21D:2B9F::/48; adding a 16-bit Subnet ID of zero gives the subnet FDE6:A21D:2B9F::/64

Also included in the RFC are sample probabilities of IPv6 address prefix uniqueness depending on the number of peer connections to a site. It’s safe to say if you experience an overlap using this method to assign Global IDs, play the damn lottery. While this method almost eliminates any overlap possibility between sites, the Global IDs generated with this method are hardly “pretty” numbers, and there will undoubtedly be folks assigning Global IDs of ::1/40. If you have ever gone through a merger/acquisition with IPv4, do yourself a favor and follow the algorithm when assigning your Global IDs.


OSPF Virtual Links in the Real World

Posted in Networking by shaw38 on May 13, 2009

I never thought I would come across the opportunity to use an OSPF virtual link in a production environment, but sure enough, yesterday was the day. The Maryland area had dual links from a 6500 running VSS into our OSPF backbone. Because of a fiber cut, both adjacencies into area 0 were lost (interfaces stayed up). This 6500 also had interfaces in area 18 for redundancy. In theory, the area 0 links would be lost, the 6500 would no longer be an ABR, and traffic would re-route back through area 18 to the other ABR.

False.

Why? The 6500 still believed it was an ABR and the loop prevention rules of OSPF ABRs kicked in:

  1. A type-3 LSA learned via a non-backbone area will not be forwarded back into the backbone area. This is similar to split-horizon with Distance Vector routing protocols.
  2. ABRs will ignore LSAs advertised by other ABRs when calculating least cost paths. An ABR must not select a path through a non-backbone area to reach the backbone area.

Rule #2 applies to this particular situation. Summary LSAs from area 0 were in the OSPF database of the 6500. However, the LSAs were not being considered for SPF calculation because they were learned via a non-backbone area.

A couple reasons why this failed:

  • The interfaces in area 0 on the 6500 stayed up, but the neighbor adjacencies were lost, so the 6500 still considered itself an ABR. This caused area 0 to become partitioned relative to this ABR. In certain IOS releases (I believe), if the adjacency is lost, the interface will be marked as “down” from an OSPF standpoint. Here’s an example from 12.4 mainline code:
    *Mar 1 00:55:28.407: %OSPF-5-ADJCHG: Process 100, Nbr 10.4.4.2 on FastEthernet1/0 from FULL to DOWN, Neighbor Down: Dead timer expired
    !
    R4#sh int fast 1/0
    FastEthernet1/0 is up, line protocol is up
    !
    R4#sh ip ospf
    Area BACKBONE(0) (Inactive)
    Number of interfaces in this area is 1
    Area has no authentication
    SPF algorithm last executed 00:00:17.356 ago
    SPF algorithm executed 5 times
    Area ranges are
    Number of LSA 14. Checksum Sum 0x085E01
    Number of opaque link LSA 0. Checksum Sum 0x000000
    Number of DCbitless LSA 0
    Number of indication LSA 0
    Number of DoNotAge LSA 9
    Flood list length 0
     
  • While digging around during the outage, the 6500 chassis had two port-channel interfaces in area 0 pointing back into the location. I have no idea why. So even if the IOS version running behaved as described above, because of these port-channels the 6500 would still have considered itself an ABR and not considered the summary LSAs learned via area 18 for SPF calculation.

There are a couple ways to address this situation:

  • Build a virtual link to the other ABR:
  • ABR1#conf t
    Enter configuration commands, one per line. End with CNTL/Z.
    ABR1(config)#router ospf 100
    ABR1(config-router)#area 18 virtual-link 10.3.3.2
    !
    ABR2#conf t
    ABR2(config)#router ospf 100
    ABR2(config-router)#area 18 virtual-link 10.2.2.1
    !
    *Mar 1 01:33:56.075: %OSPF-5-ADJCHG: Process 100, Nbr 10.3.3.2 on OSPF_VL2 from LOADING to FULL, Loading Done
    !
    *Mar 1 01:33:58.403: %OSPF-5-ADJCHG: Process 100, Nbr 10.2.2.1 on OSPF_VL3 from LOADING to FULL, Loading Done
    !
    ABR2#show ip ospf interface
    OSPF_VL3 is up, line protocol is up
    Internet Address 10.3.3.2/30, Area 0
    Process ID 100, Router ID 10.3.3.2, Network Type VIRTUAL_LINK, Cost: 2


    What this does is defeat the loop prevention rules mentioned above, specifically #2. As you can see, the virtual link between the ABRs is essentially the same as having a link in area 0. This allows each ABR to learn LSAs via area 0 and consider them for SPF calculation. A virtual link is a nice alternative to a tunnel because traffic is natively routed; if you look at the routing table, an intra-area router will be the next hop for a route learned via the opposing ABR. There is a caveat in that virtual links cannot be used if the underlying transit area is a stub. Why? Intra-area stub routers lack full OSPF databases, which means they lack complete forwarding information, which means there is a possibility of loops. As mentioned, traffic over a virtual link is not tunneled but natively routed, so the intra-area routers must have complete forwarding information if they are to be used as next-hop routers for an ABR. This is similar to the iBGP full-mesh requirement.

EIGRP Edge Routing w/ DMVPN

Posted in Networking by shaw38 on April 20, 2009

We have a ton of remote sites (payment centers, small offices, etc.) with a single router on-site, typically an 871 series hanging off a cable modem with a static IP. These routers are terminated via point-to-point GRE tunnels back to a pair of central hub sites. Because we run OSPF and are poorly summarized, these routers can carry up to 7500 prefixes depending on the area type of the market. All these routers really need is a default route back towards the two hubs in an active/standby model, with return traffic to the spokes preferring the active hub site for symmetry. Secondly, provisioning new remotes requires touching both the hub and spoke routers to build tunnel interfaces. The configs on the hub sites can be fairly long and annoying to troubleshoot, especially with the static IPs of the remote sites changing over time and a lack of cleanup of old configs.

To address the size of the RIB on the spokes, we could use statics and redistribution on the hubs and floating statics on the spoke routers. That would reduce the RIB on the spokes, but it's fairly high maintenance at both hub sites. With a static each for a loopback, voice, and data VLAN per spoke, you're looking at at least 90 statics on some of the hub routers. That's not helping the config-complexity problem. Distribute lists do not work with OSPF in the outbound direction, and the interface-level "ip ospf database-filter all" won't help us leak a default to the spokes. We need distance vector: EIGRP stub flags and a 0.0.0.0/0 summary towards the spokes would be perfect.
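For scale, the rejected floating-static design would look something like this on each spoke (next-hop addresses are hypothetical, borrowed from the lab addressing below), multiplied by per-spoke statics and redistribution back at each hub:

```
! Hypothetical sketch of the rejected approach: a floating static
! default on a spoke, standby hub at administrative distance 250.
ip route 0.0.0.0 0.0.0.0 10.1.1.1
ip route 0.0.0.0 0.0.0.0 10.1.1.2 250
```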

To cut down on the tunnel interfaces, the obvious choice is DMVPN. There is little to no spoke-to-spoke traffic, so DMVPN will purely serve as a tool for configuration simplification.

Here's the DMVPN configuration for the two hub sites. First, notice there are no static unicast maps, multicast maps, or NHS configuration pointing to the opposite hub site. Basically, I don't want a tunnel, and ultimately an EIGRP neighbor relationship, built from Hub site 1 to Hub site 2. There would be no reason to have this in place, and it would only cause issues since both hubs are only advertising a default route out their tunnel interfaces. Secondly, Hub site 1 has its tunnel interface delay set to 100 so all spokes will prefer the default route via Hub site 1 after calculating their feasible distance. Lastly, the default route being generated towards the spokes is set with an administrative distance of 254. The reason is that when you manually summarize, a summary route is generated in the local routing table pointing to Null0, with a default administrative distance of 5. While this is not necessarily a problem for CIDR blocks where more-specific prefixes exist in the RIB, it can cause traffic following a default route to be black-holed. Raising the null route's distance to 254 keeps it from being preferred over any IGP-learned default route.

Oh, and notice split-horizon and next-hop-self for EIGRP are not being disabled on the tunnel interfaces. We are not interested in spoke-to-spoke tunnels, nor are we interested in spokes having all routes within the DMVPN. Disabling split-horizon would allow the spoke prefixes to be advertised back out the tunnel interfaces to the other spokes. Disabling next-hop-self would allow those prefixes to be advertised with a next hop of the advertising spoke router (which is where the NHRP query would come into play for a spoke-to-spoke tunnel).

Hub Site 1:
interface Tunnel0
ip address 10.255.0.1 255.255.255.0
no ip redirects
ip nhrp authentication ccie
ip nhrp map multicast dynamic
ip nhrp network-id 10
ip summary-address eigrp 10 0.0.0.0 0.0.0.0 254
delay 100
tunnel source FastEthernet1/0
tunnel mode gre multipoint
tunnel key 10
!
router eigrp 10
network 10.255.0.0 0.0.0.255
no auto-summary

Hub Site 2:
interface Tunnel0
ip address 10.255.0.2 255.255.255.0
no ip redirects
ip nhrp authentication ccie
ip nhrp map multicast dynamic
ip nhrp network-id 10
ip summary-address eigrp 10 0.0.0.0 0.0.0.0 254
delay 200
tunnel source FastEthernet1/0
tunnel mode gre multipoint
tunnel key 10
!
router eigrp 10
network 10.255.0.0 0.0.0.255
no auto-summary
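The delay values translate into spoke route preference as follows. With default K values, EIGRP's classic composite metric reduces to 256 × (10^7 / min-bandwidth-in-kbps + cumulative-delay-in-tens-of-microseconds). A rough Python sketch of the comparison; the 100 kbps tunnel bandwidth is an assumption for illustration, and the spoke's own interface delay is ignored since it is identical via either hub:

```python
# EIGRP classic composite metric with default K values (K1 = K3 = 1):
#   metric = 256 * (10**7 // min_bw_kbps + total_delay_tens_usec)
# The IOS "delay" interface command is configured in tens of microseconds.

def eigrp_metric(min_bw_kbps: int, total_delay_tens_usec: int) -> int:
    return 256 * (10**7 // min_bw_kbps + total_delay_tens_usec)

BW = 100  # kbps -- an assumed tunnel bandwidth, for illustration only

via_hub1 = eigrp_metric(BW, 100)  # "delay 100" on Hub site 1's tunnel
via_hub2 = eigrp_metric(BW, 200)  # "delay 200" on Hub site 2's tunnel

# Lower metric wins: every spoke prefers the default from Hub site 1.
assert via_hub1 < via_hub2
```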

 

Here's the DMVPN configuration for the spoke sites. It's pretty straightforward. I found the "ip nhrp registration timeout" command had to be added on the spokes. When testing failure of a hub site, there were issues with EIGRP adjacencies being reformed with the failed hub router when it came back online. Because we don't want Hub site 1 and Hub site 2 to be NHRP peers, the hub sites will not query one another for NHRP mappings. So instead, the spokes re-register with their next-hop servers every 5 seconds. When a hub comes back online, it will receive the registration message from the spoke, rebuild its mGRE tunnel, and reform its EIGRP adjacency. The spoke routers do not necessarily need to be configured as stubs, as they should never be queried, but it is good practice.

Spoke 1:
interface Tunnel0
ip address 10.255.0.3 255.255.255.0
no ip redirects
ip nhrp authentication ccie
ip nhrp map multicast dynamic
ip nhrp map 10.255.0.1 10.1.1.1
ip nhrp map multicast 10.1.1.1
ip nhrp map multicast 10.1.1.2
ip nhrp map 10.255.0.2 10.1.1.2
ip nhrp network-id 10
ip nhrp nhs 10.255.0.1
ip nhrp nhs 10.255.0.2
ip nhrp registration timeout 5
tunnel source FastEthernet0/0
tunnel mode gre multipoint
tunnel key 10
!
router eigrp 10
passive-interface default
no passive-interface Tunnel0
network 4.4.4.4 0.0.0.0
network 10.255.0.0 0.0.0.255
network 10.255.1.0 0.0.0.255
no auto-summary
eigrp stub connected
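Once a spoke is up, a few standard show commands confirm the design. These are generic IOS commands, not output captured from this lab, so expect your results to vary:

```
! On a spoke:
show ip nhrp              ! static NHRP mappings to both hub NBMA addresses
show ip eigrp neighbors   ! adjacencies to 10.255.0.1 and 10.255.0.2
show ip route eigrp       ! a single D* 0.0.0.0/0 via Hub site 1
```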

 

So that takes care of getting a dynamic default to the spokes, but now we need to advertise the routes from the spokes back into the rest of the network. Remember, we set the delay of the tunnel interface on Hub site 1 so all spokes would prefer its default route. When redistributing EIGRP into OSPF, we will be injecting the spoke prefixes as E2s with a metric of 100 from Hub site 1 and a metric of 200 from Hub site 2. This should give us traffic symmetry. Also, we are only permitting specific blocks for redistribution. We don't want anyone routing any prefix they damn please.

Hub Site 1:
router ospf 100
router-id 1.1.1.1
log-adjacency-changes
redistribute eigrp 10 subnets route-map EIGRP-->OSPF
!
ip prefix-list DMVPN_SPOKES seq 5 permit 10.255.0.0/16 le 32
!
route-map EIGRP-->OSPF permit 10
match ip address prefix-list DMVPN_SPOKES
set metric 100

Hub Site 2:
router ospf 100
router-id 2.2.2.2
log-adjacency-changes
redistribute eigrp 10 subnets route-map EIGRP-->OSPF
!
ip prefix-list DMVPN_SPOKES seq 5 permit 10.255.0.0/16 le 32
!
route-map EIGRP-->OSPF permit 10
match ip address prefix-list DMVPN_SPOKES
set metric 200
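The "le 32" semantics trip people up: the prefix-list matches any route whose first 16 bits fall inside 10.255.0.0/16 and whose mask length is anywhere from /16 to /32. A rough Python model of that match logic (a sketch of the semantics, not IOS itself):

```python
# Model of: ip prefix-list DMVPN_SPOKES seq 5 permit 10.255.0.0/16 le 32
import ipaddress

COVERING = ipaddress.ip_network("10.255.0.0/16")

def matches_dmvpn_spokes(prefix: str) -> bool:
    net = ipaddress.ip_network(prefix)
    # Address bits must fall within 10.255.0.0/16, and the mask
    # length must be between the stated /16 and the "le 32" bound.
    return net.subnet_of(COVERING) and 16 <= net.prefixlen <= 32

assert matches_dmvpn_spokes("10.255.1.0/24")      # spoke LAN: redistributed
assert matches_dmvpn_spokes("10.255.0.0/24")      # tunnel subnet: redistributed
assert not matches_dmvpn_spokes("4.4.4.4/32")     # outside the block: denied
```

Note the last case: a spoke loopback like 4.4.4.4/32 from the spoke config above would be filtered at redistribution by this prefix-list.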

 

With this configuration, provisioning new spokes should be zero touch on the hub site routers. I'll be adding the crypto configurations at a later date.
