VxLAN fabric. Part 2

Hey Habr. I am continuing the series of articles on VxLAN EVPN technology, written specifically for the launch of the OTUS "Network engineer" course. Today we will look at an interesting part of the job: routing. Trite as it may sound, inside a network fabric things may turn out to be less simple than expected.

Part 1 of the series: L2 connectivity between servers

In the previous part we built a single broadcast domain on top of a network fabric running on Nexus 9000v. However, that is not the full range of tasks a data center network has to solve. Today we will look at the next one: routing between networks, that is, between VNIs.

Let me remind you that the Spine-Leaf topology is used:

[Figure: Spine-Leaf topology]

To begin with, we will analyze how routing occurs and what features it has.

To make this easier to follow, let's simplify the logical diagram and add another VNI, 20000, for Host-2. The result is:

[Figure: simplified logical diagram with VNI 20000 added for Host-2]

How, in this case, can traffic be passed from one host to the other?

There are two options:

  1. Keep information about all VNIs on all Leaf switches, so that all routing happens on the first Leaf in the path;
  2. Use a dedicated L3 VNI.

The first way is simple and convenient, since you only need to configure all VNIs on all Leaf switches. However, configuring several hundred or several thousand VNIs on every Leaf no longer looks like an easy task, so in practice it is used quite rarely.

We will look at the second method: it is more interesting and slightly more complicated, but gives more flexibility in configuring the fabric.

Let's add a VRF named PROD to the topology. We put interface Vlan10 into it on the Leaf-11/12 pair and interface Vlan20 on Leaf-21. VLAN 20 is associated with VNI 20000:

vrf context PROD
  rd auto       ! The Route Distinguisher is not critical here, so the automatically generated one can be used
  address-family ipv4 unicast
    route-target both auto      ! Route-target with which prefixes are imported into and exported from the VRF
vlan 20
  vn-segment 20000

interface nve 1
  member vni 20000
    ingress-replication protocol bgp

interface Vlan20
  no shutdown
  vrf member PROD
  ip address 192.168.20.1/24
  fabric forwarding mode anycast-gateway

To use an L3 VNI, you need to create a new VLAN and associate it with a new VNI. This VNI must be the same on all Leafs that are interested in information about VLAN 10 and VLAN 20.

vlan 99
  vn-segment 99000

interface nve1
  member vni 99000 associate-vrf        ! Create the L3 VNI

vrf context PROD
  vni 99000                             ! Bind the L3 VNI to a specific VRF

As a result, the diagram will look like this:

[Figure: logical diagram with L3 VNI 99000 added]

One small step remains: add one more interface, interface Vlan99, to VRF PROD:

interface Vlan99
  no shutdown
  vrf member PROD
  ip forward  ! The interface must not have an IP address. It is used only to forward packets between Leafs

As a result, a frame travels from Host-1 to Host-2 as follows:

  1. A frame sent by Host-1 arrives at a Leaf in VLAN 10, which is associated with VNI 10000;
  2. The Leaf looks up the destination address and finds it reachable via the L3 VNI on the second Leaf switch;
  3. Once the route to the destination is found, the Leaf encapsulates the frame in a VxLAN header with L3 VNI 99000 and sends it towards the second Leaf;
  4. The second Leaf receives the data in L3 VNI 99000, extracts the original frame, and hands it over to L2 VNI 20000 and then to VLAN 20.
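The four steps above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the class `VxlanPacket`, the two dictionaries, and the function names are hypothetical, and only the addresses and VNI numbers come from the article's example. It is not how NX-OS implements forwarding.

```python
from dataclasses import dataclass
from typing import Tuple

L3VNI = 99000  # L3 VNI bound to VRF PROD in the article

@dataclass
class VxlanPacket:
    vni: int          # VNI carried in the VxLAN header
    remote_vtep: str  # outer destination IP (the egress Leaf)
    dst_ip: str       # inner destination IP

# Hypothetical view of VRF PROD on the ingress Leaf: host route -> advertising VTEP
ingress_vrf_routes = {"192.168.20.20": "10.255.1.20"}

# Hypothetical view of the egress Leaf: locally attached host -> (VLAN, L2 VNI)
egress_local_hosts = {"192.168.20.20": (20, 20000)}

def ingress_leaf(dst_ip: str) -> VxlanPacket:
    """Steps 1-3: look the destination up in the VRF, encapsulate with the L3 VNI."""
    vtep = ingress_vrf_routes[dst_ip]
    return VxlanPacket(vni=L3VNI, remote_vtep=vtep, dst_ip=dst_ip)

def egress_leaf(pkt: VxlanPacket) -> Tuple[int, int]:
    """Step 4: the L3 VNI selects the VRF, the inner IP selects VLAN and L2 VNI."""
    if pkt.vni != L3VNI:
        raise ValueError("unexpected VNI for VRF PROD")
    return egress_local_hosts[pkt.dst_ip]

pkt = ingress_leaf("192.168.20.20")
print(pkt)
print(egress_leaf(pkt))
```

Note that the ingress Leaf in this model never needs to know about VNI 20000: it only needs the host route and the shared L3 VNI.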

In this way, the L3 VNI removes the need to keep information about every VNI in the network on every Leaf switch.

When we send traffic from Host-1 to Host-2, the packet is encapsulated in VxLAN with the new VNI, 99000:

[Figure: packet from Host-1 to Host-2 encapsulated in VxLAN with VNI 99000]

What remains is to see how exactly Leaf-1 learns about a MAC address from another VNI. This is also done with EVPN route-type 2 (MAC/IP).

The following shows the process of propagating a route about a prefix located in another VNI:

[Figure: propagation of a route for a prefix located in another VNI]

That is, addresses received from VNI 20000 carry two RTs.
Let me remind you that routes received in an Update get into the BGP table according to the Route-target specified in the VRF settings (the process is somewhat more complicated, but we will not go into it in this article).
The RT itself is formed as AS:VNI (if automatic mode is used).

An example of RT configuration in automatic and manual modes:

vrf context PROD
  address-family ipv4 unicast
    route-target import auto              ! automatic mode
    route-target export 65001:20000       ! RT set manually
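The AS:VNI rule for automatic mode is easy to check by hand. A minimal sketch, assuming the AS number 65001 and the VNIs used in this article; the helper name `auto_route_target` is hypothetical:

```python
def auto_route_target(asn: int, vni: int) -> str:
    """Build an RT the way the article describes automatic mode: AS:VNI."""
    return f"{asn}:{vni}"

# Values from the article: AS 65001, L2 VNI 20000, L3 VNI 99000
l2_rt = auto_route_target(65001, 20000)
l3_rt = auto_route_target(65001, 99000)
print(l2_rt, l3_rt)
```

The manual export value 65001:20000 in the configuration above is therefore the same string that automatic mode would produce for VNI 20000.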

As shown above, prefixes from another VNI carry two RT values.
One of them, 65001:99000, belongs to the additional L3 VNI. Since this VNI is the same on all Leafs and matches the import rules in our VRF settings, the prefix gets into the BGP table, which can be seen in the output:

sh bgp l2vpn evpn
<.....>
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.255.1.11:32777    (L2VNI 10000)
*>l[2]:[0]:[0]:[48]:[5001.0007.0007]:[0]:[0.0.0.0]/216
                      10.255.1.10                       100      32768 i
*>l[2]:[0]:[0]:[48]:[5001.0007.0007]:[32]:[192.168.10.10]/272
                      10.255.1.10                       100      32768 i
*>l[3]:[0]:[32]:[10.255.1.10]/88
                      10.255.1.10                       100      32768 i

Route Distinguisher: 10.255.1.21:32787
* i[2]:[0]:[0]:[48]:[5001.0008.0007]:[32]:[192.168.20.20]/272    ! Prefix received from VNI 20000
                      10.255.1.20                       100          0 i
*>i                   10.255.1.20                       100          0 i
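The bracketed route keys in this output can be split into fields mechanically. The sketch below is a reading aid only: the field names in `FIELDS` are my interpretation of the display format (route type, ESI, Ethernet tag, lengths, addresses) and are not defined anywhere in the article, and `parse_evpn_key` is a hypothetical helper.

```python
import re

# Assumed field names per route type (an interpretation, not from the article)
FIELDS = {
    2: ["esi", "eth_tag", "mac_len", "mac", "ip_len", "ip"],
    3: ["eth_tag", "ip_len", "originator_ip"],
    5: ["esi", "eth_tag", "prefix_len", "prefix"],
}

def parse_evpn_key(key: str) -> dict:
    """Split a '[..]:[..]/len' route key into named string fields."""
    values = re.findall(r"\[([^\]]*)\]", key)
    route_type = int(values[0])
    parsed = {"route_type": route_type}
    parsed.update(zip(FIELDS[route_type], values[1:]))
    return parsed

# Key copied from the output above
host = parse_evpn_key("[2]:[0]:[0]:[48]:[5001.0008.0007]:[32]:[192.168.20.20]/272")
print(host["mac"], host["ip"])
```

Read this way, the route-type 2 entry from VNI 20000 carries both the MAC address and the /32 IP address of Host-2.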

If we look more closely at the received update, we can see that this prefix has two RTs:

Leaf11# sh bgp l2vpn evpn 5001.0008.0007
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 10.255.1.21:32787
BGP routing table entry for [2]:[0]:[0]:[48]:[5001.0008.0007]:[32]:[192.168.20.20]/272, version 5164
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW

  Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
  AS-Path: NONE, path sourced internal to AS
    10.255.1.20 (metric 81) from 10.255.1.102 (10.255.1.102)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 20000 99000                                 ! Two labels used by VxLAN
      Extcommunity: RT:65001:20000 RT:65001:99000 SOO:10.255.1.20:0 ENCAP:8     ! Two Route-target values, on the basis of which this prefix was added
          Router MAC:5001.0005.0007
      Originator: 10.255.1.21 Cluster list: 10.255.1.102
<......>
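The import decision can be illustrated with a few lines of Python. This is a sketch, not the BGP implementation: the function name `route_targets` is hypothetical, the Extcommunity string is copied from the output above, and the import set assumes that automatic mode in VRF PROD resolves to 65001:99000 (AS:L3 VNI), as described earlier.

```python
def route_targets(extcommunity: str) -> set:
    """Return the RT values found in an Extcommunity line."""
    return {token[len("RT:"):] for token in extcommunity.split()
            if token.startswith("RT:")}

# Copied from the 'sh bgp l2vpn evpn 5001.0008.0007' output above
received = "RT:65001:20000 RT:65001:99000 SOO:10.255.1.20:0 ENCAP:8"

# Assumption: import RT of VRF PROD in automatic mode (AS 65001, L3 VNI 99000)
vrf_prod_import = {"65001:99000"}

matching = route_targets(received) & vrf_prod_import
print("imported into VRF PROD:", bool(matching), sorted(matching))
```

Leaf-11 has no VNI 20000 configured, so only the L3 VNI route target gives it a reason to import the route into VRF PROD.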

In the routing table on Leaf-1, you can also see the prefix 192.168.20.20/32:

Leaf11# sh ip route vrf PROD
192.168.10.0/24, ubest/mbest: 1/0, attached
    *via 192.168.10.1, Vlan10, [0/0], 01:29:28, direct
192.168.10.1/32, ubest/mbest: 1/0, attached
    *via 192.168.10.1, Vlan10, [0/0], 01:29:28, local
192.168.10.10/32, ubest/mbest: 1/0, attached
    *via 192.168.10.10, Vlan10, [190/0], 01:27:22, hmm
192.168.20.20/32, ubest/mbest: 1/0                                        ! Address of Host-2
    *via 10.255.1.20%default, [200/0], 01:20:20, bgp-65001, internal, tag 65001     ! Reachable via Leaf-2
(evpn) segid: 99000 tunnelid: 0xaff0114 encap: VXLAN                                ! Via VNI 99000

Notice that the covering prefix 192.168.20.0/24 is missing from the routing table?
That's right, it is not there. Remote Leafs receive information only about the individual hosts in that network, and this is the correct behavior: in all the updates above the information arrives as MAC/IP, and there are no prefixes in them at all.

This is handled by the Host Mobility Manager (HMM) protocol, which fills the ARP table, from which the BGP table is then populated (we will leave this process out of this article). Based on the information received from HMM, EVPN route-type 2 routes (carrying MAC/IP) are generated.

However, what if there is a need to pass information about a prefix?

For this kind of information there is EVPN route-type 5, which allows prefixes to be carried in address-family l2vpn evpn (at the time of writing this route type exists only as a draft RFC, so its behavior may differ between vendors).

To advertise prefixes, specify in the BGP process for the VRF which prefixes are to be advertised:

router bgp 65001
  vrf PROD
    address-family ipv4 unicast
      redistribute direct route-map VNI20000        ! Here we advertise prefixes directly connected to the Leaf in VNI 20000
route-map VNI20000 permit 10
  match ip address prefix-list VNI20000_OUT    ! Specify which prefix-list to use

ip prefix-list VNI20000_OUT seq 5 permit 192.168.20.0/24   ! Specify which networks will go into EVPN route-type 5
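To see which routes this configuration selects, here is a small Python model. It is a sketch under assumptions: the route list mirrors the VRF PROD table of Leaf21 shown later in this article, the function `redistributed` is hypothetical, and the prefix-list entry is treated as an exact match because it has no `ge` or `le` qualifier.

```python
import ipaddress

# (prefix, source) pairs as they appear in the Leaf21 VRF PROD routing table
vrf_routes = [
    ("192.168.20.0/24", "direct"),
    ("192.168.20.1/32", "local"),
    ("192.168.20.20/32", "hmm"),
]

# ip prefix-list VNI20000_OUT seq 5 permit 192.168.20.0/24
permitted = ipaddress.ip_network("192.168.20.0/24")

def redistributed(routes, permit):
    """Keep 'direct' routes whose prefix exactly equals the permitted one."""
    selected = []
    for prefix, source in routes:
        if source != "direct":
            continue  # 'redistribute direct' only considers connected routes
        if ipaddress.ip_network(prefix) == permit:
            selected.append(prefix)
    return selected

advertised = redistributed(vrf_routes, permitted)
print(advertised)
```

In this model only the connected /24 passes, which is the network that ends up in an EVPN route-type 5.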

As a result, the Update will look like this:

[Figure: BGP Update carrying an EVPN route-type 5 prefix]

Let's look at the BGP table. In addition to EVPN route types 2 and 3, type 5 routes have appeared, carrying the network prefix:

<......>
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.255.1.11:3
* i[5]:[0]:[0]:[24]:[192.168.10.0]/224
                      10.255.1.10              0        100          0 ?
*>i                   10.255.1.10              0        100          0 ?

Route Distinguisher: 10.255.1.11:32777
* i[2]:[0]:[0]:[48]:[5001.0007.0007]:[0]:[0.0.0.0]/216
                      10.255.1.10                       100          0 i
*>i                   10.255.1.10                       100          0 i
* i[2]:[0]:[0]:[48]:[5001.0007.0007]:[32]:[192.168.10.10]/272
                      10.255.1.10                       100          0 i
*>i                   10.255.1.10                       100          0 i
* i[3]:[0]:[32]:[10.255.1.10]/88
                      10.255.1.10                       100          0 i
*>i                   10.255.1.10                       100          0 i

Route Distinguisher: 10.255.1.12:3
*>i[5]:[0]:[0]:[24]:[192.168.10.0]/224      ! EVPN route-type 5 carrying the network prefix
                      10.255.1.10              0        100          0 ?
* i
<.......>                   

The prefix also appeared in the routing table:

Leaf21# sh ip ro vrf PROD
192.168.10.0/24, ubest/mbest: 1/0
    *via 10.255.1.10%default, [200/0], 00:14:32, bgp-65001, internal, tag 65001  ! Remote prefix reachable via Leaf1/2 (next-hop address = virtual IP shared by the vPC pair)
(evpn) segid: 99000 tunnelid: 0xaff010a encap: VXLAN      ! Prefix is reachable via L3VNI 99000

192.168.10.10/32, ubest/mbest: 1/0
    *via 10.255.1.10%default, [200/0], 02:33:40, bgp-65001, internal, tag 65001
(evpn) segid: 99000 tunnelid: 0xaff010a encap: VXLAN

192.168.20.0/24, ubest/mbest: 1/0, attached
    *via 192.168.20.1, Vlan20, [0/0], 02:39:44, direct
192.168.20.1/32, ubest/mbest: 1/0, attached
    *via 192.168.20.1, Vlan20, [0/0], 02:39:44, local
192.168.20.20/32, ubest/mbest: 1/0, attached
    *via 192.168.20.20, Vlan20, [190/0], 02:35:46, hmm
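With both host routes and the /24 in the table, the usual longest-prefix-match rule decides which entry is used. A minimal sketch with Python's `ipaddress` module: the table is copied from the Leaf21 output above, while the address 192.168.10.55 is a hypothetical host that has not been learned via route-type 2.

```python
import ipaddress

# Prefixes from 'sh ip ro vrf PROD' on Leaf21, with a short note on each
routes = {
    "192.168.10.0/24": "L3VNI 99000 via 10.255.1.10 (route-type 5)",
    "192.168.10.10/32": "L3VNI 99000 via 10.255.1.10 (route-type 2)",
    "192.168.20.0/24": "Vlan20, direct",
    "192.168.20.20/32": "Vlan20, hmm",
}

def lookup(dst: str):
    """Return the most specific matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(p) for p in routes
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))

print(lookup("192.168.10.10"))  # known host: matches the /32
print(lookup("192.168.10.55"))  # hypothetical unknown host: falls back to the /24
```

The /24 learned through route-type 5 thus acts as a fallback for destinations in 192.168.10.0/24 that have no host route yet.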

This concludes the second part of the series of articles on VxLAN EVPN. In the next part we will look at various options for routing between VRFs.

Source: habr.com
