Hey Habr. I continue the series of articles on VxLAN EVPN technology, which were written specifically for the launch of the course
In the last part, we achieved one broadcast domain built on top of a network fabric on a Nexus 9000v. However, this is not the whole range of tasks that need to be solved within the framework of the data center network. And today we will consider the following task - routing between networks or between VNIs.
Let me remind you that the Spine-Leaf topology is used:
To begin with, we will analyze how routing occurs and what features it has.
For understanding, let's simplify the logic diagram and add another VNI 20000 for Host-2. The result is:
How, in this case, can you transfer traffic from one Host to another?
There are two options:
- Keep information about all VNIs on all Leaf switches, then all routing will occur on the first Leaf in the network;
- Use dedicated - L3 VNI
The first way is simple and convenient. Since you only need to start all VNIs on all Leaf switches. However, running a few hundred or thousands of VNIs on the entire Leaf no longer seems like an easy task. Therefore, in the work it is used quite rarely.
We will analyze method 2, as more interesting and slightly more complicated, but giving more flexibility in setting up the factory.
Let's add "PROD" to the VRF topology. Let's add interface vlan 10 to it on the Leaf-11/12 pair and interface VLAN 20 on Leaf-21. VLAN 20 is associated with VNI 20000
vrf context PROD
rd auto ! Route Distinguisher Π½Π΅ ΠΏΡΠΈΠ½ΡΠΈΠΏΠΈΠ°Π»Π΅Π½ ΠΈ ΠΌΠΎΠΆΠ΅ΠΌ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΡΡΠΎΡΠΌΠΈΡΠΎΠ²Π°Π½Π½ΡΠΉ Π°Π²ΡΠΎΠΌΠ°ΡΠΈΡΠ΅ΡΠΊΠΈ
address-family ipv4 unicast
route-target both auto ! ΡΠΊΠ°Π·ΡΠ²Π°Π΅ΠΌ Route-target Ρ ΠΊΠΎΡΠΎΡΡΠΌ Π±ΡΠ΄ΡΡ ΠΈΠΌΠΏΠΎΡΡΠΈΡΠΎΠ²Π°ΡΡΡΡ ΠΈ ΡΠΊΡΠΏΠΎΡΡΠΈΡΠΎΠ²Π°ΡΡΡΡ ΠΏΡΠ΅ΡΠΈΠΊΡΡ Π²/ΠΈΠ· VRF
vlan 20
vn-segment 20000
interface nve 1
member vni 20000
ingress-replication protocol bgp
interface Vlan10
no shutdown
vrf member PROD
ip address 192.168.20.1/24
fabric forwarding mode anycast-gateway
In order to use L3VNI, you need to create a new VLAN, associate it with the new VNI. The new VNI must be the same on all Leafs interested in VLAN 10 and 20 information.
vlan 99
vn-segment 99000
interface nve1
member vni 99000 associate-vrf ! Π‘ΠΎΠ·Π΄Π°Π΅ΠΌ L3 VNI
vrf context PROD
vni 99000 ! ΠΡΠΈΠ²ΡΠ·ΡΠ²Π°Π΅ΠΌ L3 VNI ΠΊ ΠΎΠΏΡΠ΅Π΄Π΅Π»Π΅Π½Π½ΠΎΠΌΡ VRF
As a result, the diagram will look like this:
It remains to finish a little - add one more interface - interface vlan 99 in VRF PROD
interface Vlan99
no shutdown
vrf member PROD
ip forward ! ΠΠ° ΠΈΠ½ΡΠ΅ΡΡΠ΅ΠΉΡΠ΅ Π½Π΅ Π΄ΠΎΠ»ΠΆΠ½ΠΎ Π±ΡΡΡ IP. ΠΡΠΏΠΎΠ»ΡΠ·ΡΠ΅ΡΡΡ ΡΠΎΠ»ΡΠΊΠΎ Π΄Π»Ρ ΠΏΠ΅ΡΠ΅ΡΡΠ»ΠΊΠΈ ΠΏΠ°ΠΊΠ΅ΡΠΎΠ² ΠΌΠ΅ΠΆΠ΄Ρ Leaf
As a result, the logic of passing the frame from Host-1 to Host-2 is as follows:
- A frame sent by Host-1 arrives on a Leaf in VLAN 10, which is associated with VNI 10000;
- Leaf checks where the destination address is and finds it via L3 VNI on the second Leaf switch;
- As soon as the route to the destination address is found, the Leaf packs the frame into a header with the necessary L3VNI 99000 - and sends it towards the second Leaf;
- The second Leaf switch receives data from L3VNI 99000. Gets the original frame and transfers it to the required L2VNI 20000 and then to VLAN 20.
As a result of this work, L3VNI removes the need to keep information about all VNIs that are on the network on all Leaf switches.
As a result, when we send traffic from Host-1 to Host-2, the packet is packed inside VxLAN with the new VNI - 99000:
It remains to be seen how exactly Leaf-1 learns about the MAC address from another VNI. This also happens with the help of EVPN route-type 2 (MAC / IP).
The following shows the process of propagating a route about a prefix located in another VNI:
That is, addresses received from VNI 20000 have two RTs.
Let me remind you that the routes received from Update fall into the BGP table with the Route-target specified in the VRF settings (the process is somewhat more complicated, but we will not go into this article).
The RT itself is formed by the formula: AS:VNI (if automatic mode is used).
An example of RT formation in automatic and manual modes:
vrf context PROD
address-family ipv4 unicast
route-target import auto - Π°Π²ΡΠΎΠΌΠ°ΡΠΈΡΠ΅ΡΠΊΠΈΠΉ ΡΠ΅ΠΆΠΈΠΌ ΡΠ°Π±ΠΎΡΡ
route-target export 65001:20000 - ΡΡΡΠ½ΠΎΠΉ ΡΠ΅ΠΆΠΈΠΌ ΡΠΎΡΠΌΠΈΡΠΎΠ²Π°Π½ΠΈΡ RT
As a result, you can see above that prefixes from another VNI have two RT values.
One of them 65001:99000 is an additional L3 VNI. Since this VNI is the same on all Leafs and falls under our import rules in the VRF settings, the prefix gets into the BGP table, which can be seen from the output:
sh bgp l2vpn evpn
<.....>
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 10.255.1.11:32777 (L2VNI 10000)
*>l[2]:[0]:[0]:[48]:[5001.0007.0007]:[0]:[0.0.0.0]/216
10.255.1.10 100 32768 i
*>l[2]:[0]:[0]:[48]:[5001.0007.0007]:[32]:[192.168.10.10]/272
10.255.1.10 100 32768 i
*>l[3]:[0]:[32]:[10.255.1.10]/88
10.255.1.10 100 32768 i
Route Distinguisher: 10.255.1.21:32787
* i[2]:[0]:[0]:[48]:[5001.0008.0007]:[32]:[192.168.20.20]/272 ! ΠΡΠ΅ΡΠΈΠΊΡ ΠΏΠΎΠ»ΡΡΠ΅Π½Π½ΡΠΉ ΠΈΠ· VNI 20000
10.255.1.20 100 0 i
*>i 10.255.1.20 100 0 i
If we look more closely at the received update, we can see that this prefix has two RTs:
Leaf11# sh bgp l2vpn evpn 5001.0008.0007
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 10.255.1.21:32787
BGP routing table entry for [2]:[0]:[0]:[48]:[5001.0008.0007]:[32]:[192.168.20.2
0]/272, version 5164
Paths: (2 available, best #2)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not i
n HW
Path type: internal, path is valid, not best reason: Neighbor Address, no labeled nexthop
AS-Path: NONE, path sourced internal to AS
10.255.1.20 (metric 81) from 10.255.1.102 (10.255.1.102)
Origin IGP, MED not set, localpref 100, weight 0
Received label 20000 99000 ! ΠΠ²Π° label Π΄Π»Ρ ΡΠ°Π±ΠΎΡΡ VxLAN
Extcommunity: RT:65001:20000 RT:65001:99000 SOO:10.255.1.20:0 ENCAP:8 ! ΠΠ²Π° Π·Π½Π°ΡΠ΅Π½ΠΈΡ Route-target, Π½Π° ΠΎΡΠ½ΠΎΠ²Π΅, ΠΊΠΎΡΠΎΡΡΡ
Π΄ΠΎΠ±Π°Π²ΠΈΠ»ΠΈ Π΄Π°Π½Π½ΡΠΉ ΠΏΡΠ΅ΡΠΈΠΊΡ
Router MAC:5001.0005.0007
Originator: 10.255.1.21 Cluster list: 10.255.1.102
<......>
In the routing table on Leaf-1, you can also see the prefix 192.168.20.20/32:
Leaf11# sh ip route vrf PROD
192.168.10.0/24, ubest/mbest: 1/0, attached
*via 192.168.10.1, Vlan10, [0/0], 01:29:28, direct
192.168.10.1/32, ubest/mbest: 1/0, attached
*via 192.168.10.1, Vlan10, [0/0], 01:29:28, local
192.168.10.10/32, ubest/mbest: 1/0, attached
*via 192.168.10.10, Vlan10, [190/0], 01:27:22, hmm
192.168.20.20/32, ubest/mbest: 1/0 ! ΠΠ΄ΡΠ΅Ρ Host-2
*via 10.255.1.20%default, [200/0], 01:20:20, bgp-65001, internal, tag 65001 ! ΠΠΎΡΡΡΠΏΠ½ΡΠΉ ΡΠ΅ΡΠ΅Π· Leaf-2
(evpn) segid: 99000 tunnelid: 0xaff0114 encap: VXLAN ! Π§Π΅ΡΠ΅Π· VNI 99000
Notice the missing primary prefix 192.168.20.0/24 in the routing table?
That's right, he's not there. That is, remote Leafs receive information only about the hosts that are on your network. And this is the correct behavior. Above, in all updates, you can see that information comes with the content of MAC / IP. There are no prefixes to speak of.
This is the Host Mobility Manager (HMM) protocol, which fills the ARP table from which the BGP table is further filled (we will omit this process within the framework of this article). Based on the information received from the HMM, route-type 2 EVPNs are formed (transmitted by MAC / IP).
However, what if there is a need to pass information about a prefix?
For this type of information, there is EVPN route-type 5 - it allows you to send prefixes via address-family l2vpn evpn (this type of route at the time of this writing is only in the draft version
To transfer prefixes, it is necessary to add prefixes in the BGP process for VRF, which will be advertised:
router bgp 65001
vrf PROD
address-family ipv4 unicast
redistribute direct route-map VNI20000 ! Π Π΄Π°Π½Π½ΠΎΠΌ ΡΠ»ΡΡΠ°Π΅ Π°Π½ΠΎΠ½ΡΠΈΡΡΠ΅ΠΌ ΠΏΡΠ΅ΡΠΈΠΊΡΡ ΠΏΠΎΠ΄ΠΊΠ»ΡΡΠ΅Π½ΠΈΠ΅ Π½Π΅ΠΏΠΎΡΡΠ΅Π΄ΡΡΠ²Π΅Π½Π½ΠΎ ΠΊ Leaf Π² VNI 20000
route-map VNI20000 permit 10
match ip address prefix-list VNI20000_OUT ! Π£ΠΊΠ°Π·ΡΠ²Π°Π΅ΠΌ ΠΊΠ°ΠΊΠΎΠΉ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ prefix-list
ip prefix-list VNI20000_OUT seq 5 permit 192.168.20.0/24 ! Π£ΠΊΠ°Π·ΡΠ²Π°Π΅ΠΌ ΠΊΠ°ΠΊΠΈΠ΅ ΡΠ΅ΡΠΈ Π±ΡΠ΄ΡΡ ΠΏΠΎΠΏΠ°Π΄Π°ΡΡ Π² EVPN route-type 5
As a result, Update will be:
Let's look at the BGP table. In addition to EVPN route-type 2,3, type 5 routes have appeared that contain information about the network number:
<......>
Network Next Hop Metric LocPrf Weight Path
Route Distinguisher: 10.255.1.11:3
* i[5]:[0]:[0]:[24]:[192.168.10.0]/224
10.255.1.10 0 100 0 ?
*>i 10.255.1.10 0 100 0 ?
Route Distinguisher: 10.255.1.11:32777
* i[2]:[0]:[0]:[48]:[5001.0007.0007]:[0]:[0.0.0.0]/216
10.255.1.10 100 0 i
*>i 10.255.1.10 100 0 i
* i[2]:[0]:[0]:[48]:[5001.0007.0007]:[32]:[192.168.10.10]/272
10.255.1.10 100 0 i
*>i 10.255.1.10 100 0 i
* i[3]:[0]:[32]:[10.255.1.10]/88
10.255.1.10 100 0 i
*>i 10.255.1.10 100 0 i
Route Distinguisher: 10.255.1.12:3
*>i[5]:[0]:[0]:[24]:[192.168.10.0]/224 ! EVPN route-type 5 Ρ Π½ΠΎΠΌΠ΅ΡΠΎΠΌ ΠΏΡΠ΅ΡΠΈΠΊΡΠ°
10.255.1.10 0 100 0 ?
* i
<.......>
The prefix also appeared in the routing table:
Leaf21# sh ip ro vrf PROD
192.168.10.0/24, ubest/mbest: 1/0
*via 10.255.1.10%default, [200/0], 00:14:32, bgp-65001, internal, tag 65001 ! Π£Π΄Π°Π»Π΅Π½Π½ΡΠΉ ΠΏΡΠ΅ΡΠΈΠΊΡ, Π΄ΠΎΡΡΡΠΏΠ½ΡΠΉ ΡΠ΅ΡΠ΅Π· Leaf1/2(Π°Π΄ΡΠ΅Ρ Next-hop = virtual IP ΠΌΠ΅ΠΆΠ΄Ρ ΠΏΠ°ΡΠΎΠΉ VPC)
(evpn) segid: 99000 tunnelid: 0xaff010a encap: VXLAN ! ΠΡΠ΅ΡΠΈΠΊΡ Π΄ΠΎΡΡΡΠΏΠ΅Π½ ΡΠ΅ΡΠ΅Π· L3VNI 99000
192.168.10.10/32, ubest/mbest: 1/0
*via 10.255.1.10%default, [200/0], 02:33:40, bgp-65001, internal, tag 65001
(evpn) segid: 99000 tunnelid: 0xaff010a encap: VXLAN
192.168.20.0/24, ubest/mbest: 1/0, attached
*via 192.168.20.1, Vlan20, [0/0], 02:39:44, direct
192.168.20.1/32, ubest/mbest: 1/0, attached
*via 192.168.20.1, Vlan20, [0/0], 02:39:44, local
192.168.20.20/32, ubest/mbest: 1/0, attached
*via 192.168.20.20, Vlan20, [190/0], 02:35:46, hmm
This concludes the second part of a series of articles on VxLAN EVPN. In the next part, we will consider various options for routing between VRFs.
Source: habr.com