How to troubleshoot a domestic IPsec VPN. Part 1

Situation

Day off. I drink coffee. The student set up a VPN connection between two points and disappeared. I check: the tunnel really exists, but there is no traffic in the tunnel. The student does not answer calls.

I put the kettle on and dive into troubleshooting the S-Terra Gateway. Below I share my experience and methodology.

Initial data

Two geographically separated sites are connected by a GRE tunnel, and the GRE traffic needs to be encrypted.

I check whether the GRE tunnel works. To do this, I ping the GRE interface of R2 from R1. This is the target traffic for encryption. No reply:

root@R1:~# ping 1.1.1.2 -c 4
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.

--- 1.1.1.2 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3057ms

I look at the logs on Gate1 and Gate2. The log happily reports that the IPsec tunnel has successfully come up, no problems:

root@Gate1:~# cat /var/log/cspvpngate.log
Aug  5 16:14:23 localhost  vpnsvc: 00100119 <4:1> IPSec connection 5 established, traffic selector 172.17.0.1->172.16.0.1, proto 47, peer 10.10.10.251, id "10.10.10.251", Filter IPsec:Protect:CMAP:1:LIST, IPsecAction IPsecAction:CMAP:1, IKERule IKERule:CMAP:1

In the IPsec statistics on Gate1, I see that the tunnel really exists, but the Rcvd counter is zero: packets are sent, yet nothing comes back.

root@Gate1:~# sa_mgr show
ISAKMP sessions: 0 initiated, 0 responded

ISAKMP connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) State Sent Rcvd
1 3 (10.10.10.251,500)-(10.10.10.252,500) active 1070 1014

IPsec connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) Protocol Action Type Sent Rcvd
1 3 (172.16.0.1,*)-(172.17.0.1,*) 47 ESP tunn 480 0
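
The "Sent is non-zero, Rcvd is zero" check can also be scripted. Below is a minimal Python sketch that parses the text of the sa_mgr show output above and reports one-way IPsec connections. The function name one_way_connections and the regular expression are my own assumptions based only on the column layout shown here; other S-Terra versions may format the table differently.

```python
import re

# Sample text copied from the sa_mgr show output above.
SAMPLE = """\
ISAKMP sessions: 0 initiated, 0 responded

ISAKMP connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) State Sent Rcvd
1 3 (10.10.10.251,500)-(10.10.10.252,500) active 1070 1014

IPsec connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) Protocol Action Type Sent Rcvd
1 3 (172.16.0.1,*)-(172.17.0.1,*) 47 ESP tunn 480 0
"""

# One row of the "IPsec connections" table (layout assumed from the sample).
ROW = re.compile(
    r"^\s*(?P<num>\d+)\s+(?P<conn_id>\d+)\s+"
    r"\((?P<local>[^,]+),[^)]*\)-\((?P<remote>[^,]+),[^)]*\)\s+"
    r"(?P<proto>\S+)\s+(?P<action>\S+)\s+(?P<type>\S+)\s+"
    r"(?P<sent>\d+)\s+(?P<rcvd>\d+)\s*$"
)


def one_way_connections(text):
    """Return (local, remote, sent) for IPsec rows with Sent > 0 and Rcvd == 0."""
    in_ipsec = False
    found = []
    for line in text.splitlines():
        # Track which table we are in; only the IPsec table is inspected.
        if line.startswith("IPsec connections:"):
            in_ipsec = True
            continue
        if line.startswith("ISAKMP connections:"):
            in_ipsec = False
            continue
        if not in_ipsec:
            continue
        m = ROW.match(line)
        if m and int(m["sent"]) > 0 and int(m["rcvd"]) == 0:
            found.append((m["local"], m["remote"], int(m["sent"])))
    return found


if __name__ == "__main__":
    for local, remote, sent in one_way_connections(SAMPLE):
        print(f"{local} -> {remote}: sent {sent}, received 0")
```

A one-way connection like this only says that return traffic is missing. It does not say which side loses it, which is why the packet has to be traced hop by hop.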

I troubleshoot S-Terra like this: I look for the point where the target packets are lost on the way from R1 to R2. In the process (spoiler) I will find the error.

Troubleshooting

Step 1. What Gate1 gets from R1

I use the built-in packet sniffer, tcpdump. I run it on the internal interface (Gi0/1 in Cisco-like notation, eth1 in Debian notation):

root@Gate1:~# tcpdump -i eth1

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
14:53:38.879525 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 1, length 64
14:53:39.896869 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 2, length 64
14:53:40.921121 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 3, length 64
14:53:41.944958 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 4, length 64

I see that Gate1 receives GRE packets from R1. I move on.
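
For longer captures, the same observation (requests arrive, replies never do) can be pulled out of the tcpdump text automatically. The following is a rough Python sketch under the assumption that the lines look exactly like the capture above; the function name unanswered is hypothetical, and tcpdump's text format may vary between versions and options.

```python
import re

# Lines copied from the tcpdump capture on eth1 above.
CAPTURE = """\
14:53:38.879525 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 1, length 64
14:53:39.896869 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 2, length 64
14:53:40.921121 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 3, length 64
14:53:41.944958 IP 172.16.0.1 > 172.17.0.1: GREv0, key=0x1, length 92: IP 1.1.1.1 > 1.1.1.2: ICMP echo request, id 2083, seq 4, length 64
"""

# Outer GRE header followed by the inner ICMP echo request/reply.
INNER_ICMP = re.compile(
    r"IP (?P<outer_src>\S+) > (?P<outer_dst>[^:\s]+): GREv0.*?"
    r"IP (?P<src>\S+) > (?P<dst>[^:\s]+): ICMP echo (?P<kind>request|reply), "
    r"id (?P<id>\d+), seq (?P<seq>\d+)"
)


def unanswered(capture):
    """Return sorted (id, seq) pairs of echo requests that have no echo reply."""
    requests, replies = set(), set()
    for line in capture.splitlines():
        m = INNER_ICMP.search(line)
        if not m:
            continue  # not a GRE-encapsulated ICMP echo line
        key = (int(m["id"]), int(m["seq"]))
        if m["kind"] == "request":
            requests.add(key)
        else:
            replies.add(key)
    return sorted(requests - replies)


if __name__ == "__main__":
    for icmp_id, seq in unanswered(CAPTURE):
        print(f"no reply for id {icmp_id}, seq {seq}")
```

On this capture all four requests come back as unanswered, which matches the 100% packet loss reported by ping on R1.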

Step 2. What Gate1 does with GRE packets

Using the klogview utility, I can see what happens to the GRE packets inside the S-Terra VPN driver:

root@Gate1:~# klogview -f 0xffffffff

filtration result for out packet 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0: chain 4 "IPsecPolicy:CMAP", filter 8, event id IPsec:Protect:CMAP:1:LIST, status PASS
encapsulating with SA 31: 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0
passed out packet 10.10.10.251->10.10.10.252, proto 50, len 160, if eth0: encapsulated

I see that the target GRE traffic (proto 47) 172.16.0.1 -> 172.17.0.1 matched (PASS) the LIST encryption rule in the CMAP crypto map and was encrypted (encapsulated). The packet was then routed out (passed out). There is no return traffic in the klogview output.

I check the access lists on Gate1. There is only one access list, LIST, which defines the target traffic for encryption, so no firewall rules are configured:

Gate1#show access-lists
Extended IP access list LIST
    10 permit gre host 172.16.0.1 host 172.17.0.1

Conclusion: the problem is not on the Gate1 device.

More about klogview

The VPN driver handles all network traffic, not just the traffic that needs to be encrypted. Here are the messages klogview shows when the VPN driver processes a packet and sends it in plain text:

root@R1:~# ping 172.17.0.1 -c 4

root@Gate1:~# klogview -f 0xffffffff

filtration result for out packet 172.16.0.1->172.17.0.1, proto 1, len 84, if eth0: chain 4 "IPsecPolicy:CMAP": no match
passed out packet 172.16.0.1->172.17.0.1, proto 1, len 84, if eth0: filtered

I see that ICMP traffic (proto 1) 172.16.0.1->172.17.0.1 did not match (no match) any encryption rule of the CMAP crypto map. The packet was routed out (passed out) in the clear.
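
In both klogview outputs above, the last line of each record states what finally happened to the packet. As an illustration, here is a small Python sketch that extracts these final lines. The regular expression is an assumption derived only from the lines shown in this article, and final_verdicts is a hypothetical helper name, not part of any S-Terra tool.

```python
import re

# Lines copied from the two klogview outputs above.
KLOG = """\
filtration result for out packet 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0: chain 4 "IPsecPolicy:CMAP", filter 8, event id IPsec:Protect:CMAP:1:LIST, status PASS
encapsulating with SA 31: 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0
passed out packet 10.10.10.251->10.10.10.252, proto 50, len 160, if eth0: encapsulated
filtration result for out packet 172.16.0.1->172.17.0.1, proto 1, len 84, if eth0: chain 4 "IPsecPolicy:CMAP": no match
passed out packet 172.16.0.1->172.17.0.1, proto 1, len 84, if eth0: filtered
"""

# Final line of a record: verdict, direction, addresses, protocol, reason.
VERDICT = re.compile(
    r"^(?P<verdict>passed|dropped) (?P<direction>in|out) packet "
    r"(?P<src>[\d.]+)->(?P<dst>[\d.]+), proto (?P<proto>\d+), "
    r"len (?P<length>\d+), if (?P<iface>[^:\s]+): (?P<reason>\w+)\s*$"
)


def final_verdicts(log):
    """Return (verdict, direction, src, dst, proto, reason) for each final line."""
    results = []
    for line in log.splitlines():
        m = VERDICT.match(line.strip())
        if m:
            results.append(
                (m["verdict"], m["direction"], m["src"], m["dst"],
                 int(m["proto"]), m["reason"])
            )
    return results


if __name__ == "__main__":
    for verdict, direction, src, dst, proto, reason in final_verdicts(KLOG):
        print(f"{direction} {src}->{dst} proto {proto}: {verdict} ({reason})")
```

Read this way, the first packet left Gate1 as ESP (proto 50, encapsulated), while the plain ping left as ICMP (proto 1, filtered), that is, unencrypted.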

Step 3. What Gate2 gets from Gate1

I start the sniffer on the WAN interface (eth0) of Gate2:

root@Gate2:~# tcpdump -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:05:45.104195 IP 10.10.10.251 > 10.10.10.252: ESP(spi=0x30088112,seq=0x1), length 140
16:05:46.093918 IP 10.10.10.251 > 10.10.10.252: ESP(spi=0x30088112,seq=0x2), length 140
16:05:47.117078 IP 10.10.10.251 > 10.10.10.252: ESP(spi=0x30088112,seq=0x3), length 140
16:05:48.141785 IP 10.10.10.251 > 10.10.10.252: ESP(spi=0x30088112,seq=0x4), length 140

I see that Gate2 is receiving ESP packets from Gate1.

Step 4. What Gate2 does with ESP packets

I run the klogview utility on Gate2:

root@Gate2:~# klogview -f 0xffffffff
filtration result for in packet 10.10.10.251->10.10.10.252, proto 50, len 160, if eth0: chain 17 "FilterChain:L3VPN", filter 21, status DROP
dropped in packet 10.10.10.251->10.10.10.252, proto 50, len 160, if eth0: firewall

I see that the ESP packets (proto 50) were dropped (DROP) by the firewall rule L3VPN. I verify that the L3VPN access list is indeed bound to Gi0/0:

Gate2#show ip interface gi0/0
GigabitEthernet0/0 is up, line protocol is up
  Internet address is 10.10.10.252/24
  MTU is 1500 bytes
  Outgoing access list is not set
  Inbound  access list is L3VPN

Found the problem.

Step 5. What's wrong with the access list

I look at the contents of the L3VPN access list:

Gate2#show access-list L3VPN
Extended IP access list L3VPN
    10 permit udp host 10.10.10.251 any eq isakmp
    20 permit udp host 10.10.10.251 any eq non500-isakmp
    30 permit icmp host 10.10.10.251 any

I see that ISAKMP packets are permitted, which is why the IPsec tunnel comes up. But there is no permit rule for ESP. Apparently, the student confused icmp and esp.
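
To make the reasoning explicit, here is a simplified Python model of how such an access list is evaluated. It assumes Cisco-style semantics: rules are checked top-down, the first match wins, and a packet that matches nothing is denied. The drop seen in klogview is consistent with that, but the model is my own sketch and not S-Terra's actual implementation. The Rule class and the evaluate function are hypothetical names; the destination is "any" in every rule, so it is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# IP protocol numbers and the UDP ports behind the ACL keywords.
PROTO = {"icmp": 1, "udp": 17, "esp": 50}
PORT = {"isakmp": 500, "non500-isakmp": 4500}


@dataclass(frozen=True)
class Rule:
    """One 'permit <proto> host <src> any [eq <port>]' entry."""
    seq: int
    proto: str
    src_host: str
    dst_port: Optional[str] = None  # None means the rule has no port condition

    def matches(self, proto, src, dst_port=None):
        if PROTO[self.proto] != proto or self.src_host != src:
            return False
        return self.dst_port is None or PORT[self.dst_port] == dst_port


# The L3VPN access list as configured by the student.
L3VPN_BROKEN = [
    Rule(10, "udp", "10.10.10.251", "isakmp"),
    Rule(20, "udp", "10.10.10.251", "non500-isakmp"),
    Rule(30, "icmp", "10.10.10.251"),
]


def evaluate(acl, proto, src, dst_port=None):
    """First matching rule permits; no match means deny (assumed semantics)."""
    for rule in acl:
        if rule.matches(proto, src, dst_port):
            return f"permit (rule {rule.seq})"
    return "deny (no rule matched)"


if __name__ == "__main__":
    print("IKE:", evaluate(L3VPN_BROKEN, 17, "10.10.10.251", 500))
    print("ESP:", evaluate(L3VPN_BROKEN, 50, "10.10.10.251"))
```

In this model IKE (UDP 500) from Gate1 is permitted by rule 10, while ESP (proto 50) matches no rule and is denied. That is the pattern observed: the tunnel negotiates fine, but the encrypted traffic is dropped.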

Editing the access list:

Gate2(config)#
ip access-list extended L3VPN
no 30
30 permit esp host 10.10.10.251 any

Step 6. Checking that it works

First of all, I make sure that the L3VPN access list is correct:

Gate2#show access-list L3VPN
Extended IP access list L3VPN
    10 permit udp host 10.10.10.251 any eq isakmp
    20 permit udp host 10.10.10.251 any eq non500-isakmp
    30 permit esp host 10.10.10.251 any

Now I send target traffic from R1:

root@R1:~# ping 1.1.1.2 -c 4
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=35.3 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=3.01 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=2.65 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=2.87 ms

--- 1.1.1.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 2.650/10.970/35.338/14.069 ms

Victory. The GRE tunnel works. The incoming traffic counter in the IPsec statistics is now non-zero:

root@Gate1:~# sa_mgr show
ISAKMP sessions: 0 initiated, 0 responded

ISAKMP connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) State Sent Rcvd
1 3 (10.10.10.251,500)-(10.10.10.252,500) active 1474 1350

IPsec connections:
Num Conn-id (Local Addr,Port)-(Remote Addr,Port) Protocol Action Type Sent Rcvd
1 4 (172.16.0.1,*)-(172.17.0.1,*) 47 ESP tunn 1920 480

On Gate2, the klogview output now shows that the target traffic 172.16.0.1->172.17.0.1 was successfully decapsulated and matched (PASS) the LIST rule in the CMAP crypto map:

root@Gate2:~# klogview -f 0xffffffff
filtration result for in packet 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0: chain 18 "IPsecPolicy:CMAP", filter 25, event id IPsec:Protect:CMAP:1:LIST, status PASS
passed in packet 172.16.0.1->172.17.0.1, proto 47, len 112, if eth0: decapsulated

Results

The student ruined the day off.
Be careful with firewall rules.

Anonymous engineer
t.me/anonymous_engineer


Source: habr.com
