How the PIM protocol works

PIM (Protocol Independent Multicast) is a family of protocols for forwarding multicast traffic between routers in a network. Neighbor relationships are built in the same way as in dynamic routing protocols. PIMv2 sends Hello messages every 30 seconds to the reserved multicast address 224.0.0.13 (All-PIM-Routers). Each message carries a Hold Timer, usually equal to 3.5 * Hello Timer, that is, 105 seconds by default.
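As a rough illustration of this timer arithmetic, the Python sketch below derives the hold time from the Hello interval and drops a neighbor once no Hello has been seen for that long. The names `hold_time` and `NeighborTable`, and the neighbor address 10.0.0.2, are hypothetical and not taken from any real implementation.

```python
HELLO_INTERVAL = 30      # seconds, PIMv2 default from the text
HOLD_MULTIPLIER = 3.5    # hold time is usually 3.5 * Hello interval


def hold_time(hello_interval=HELLO_INTERVAL):
    """Return the hold time advertised in a Hello, in seconds."""
    return HOLD_MULTIPLIER * hello_interval


class NeighborTable:
    """Hypothetical PIM neighbor table keyed by neighbor IP address."""

    def __init__(self):
        self._expires = {}   # neighbor IP -> absolute expiry time

    def on_hello(self, neighbor, now, advertised_hold):
        # Every Hello refreshes the neighbor's expiry time.
        self._expires[neighbor] = now + advertised_hold

    def alive(self, now):
        # Neighbors whose hold time has not yet run out.
        return sorted(n for n, t in self._expires.items() if t > now)


table = NeighborTable()
table.on_hello("10.0.0.2", now=0, advertised_hold=hold_time())
print(hold_time())            # 105.0
print(table.alive(now=100))   # ['10.0.0.2']
print(table.alive(now=106))   # []
```

A neighbor that keeps sending Hellos every 30 seconds never gets close to the 105-second limit, so roughly three Hellos in a row have to be lost before it is dropped.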
PIM uses two main modes of operation - Dense and Sparse mode. Let's start with Dense mode.
Source-Based Distribution Trees.
Dense mode is advisable when there are many clients of different multicast groups. When a router receives multicast traffic, the first thing it does is check it against the RPF (Reverse Path Forwarding) rule. This rule checks the multicast source against the unicast routing table: the traffic must arrive on the interface through which, according to the unicast routing table, the source is reached. This mechanism prevents loops during multicast forwarding.
R3 learns the multicast source from the Source IP of the multicast packet and checks the two streams from R1 and R2 against its own unicast table. The stream arriving on the interface indicated by the table (R1 to R3) is forwarded further, and the stream from R2 is dropped, since packets toward the multicast source have to be sent over S0/1.
The question is, what happens if there are two equal routes with the same metric? In this case the router compares the next hops of these routes, and the one with the higher IP address wins. If you need to change this behavior, you can use ECMP.
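The RPF check and the tie-break on the next-hop address can be sketched in Python roughly as follows. This is a simplified model, assuming the route list already contains only the best routes; the prefixes, next-hop addresses, and the S0/0 interface name are hypothetical.

```python
import ipaddress


def rpf_interface(source, routes):
    """Return the interface the unicast table uses to reach `source`.

    `routes` is a list of (prefix, next hop, interface). The longest
    prefix wins; among equal routes the highest next-hop address wins.
    """
    src = ipaddress.ip_address(source)
    matches = []
    for prefix, next_hop, iface in routes:
        net = ipaddress.ip_network(prefix)
        if src in net:
            matches.append((net.prefixlen, ipaddress.ip_address(next_hop), iface))
    if not matches:
        return None
    return max(matches)[2]


def rpf_check(source, arrival_iface, routes):
    """True if the packet arrived on the interface that leads to its source."""
    return rpf_interface(source, routes) == arrival_iface


routes = [("10.1.1.0/24", "10.1.13.1", "S0/1")]
print(rpf_check("10.1.1.10", "S0/1", routes))   # True
print(rpf_check("10.1.1.10", "S0/0", routes))   # False

equal_cost = [
    ("10.1.1.0/24", "10.1.13.1", "S0/1"),
    ("10.1.1.0/24", "10.1.23.2", "S0/0"),
]
print(rpf_interface("10.1.1.10", equal_cost))   # S0/0
```

A packet that fails the check is simply dropped, which is what breaks forwarding loops.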
After the RPF check, the router sends the multicast packet to all its PIM neighbors except the one from which the packet was received. Other PIM routers repeat this process. The path the multicast packet travels from the source to the final recipients forms a tree called the source-based distribution tree, shortest-path tree (SPT), or source tree. Three different names, choose any.
But what about routers that have no use for a multicast stream and nobody to forward it to, while the upstream router keeps sending it? The Prune mechanism was invented for this.
Prune message.
For example, R2 will keep sending the multicast to R3, even though R3 drops it according to the RPF rule. Why load the link for nothing? R3 sends a PIM Prune message, and R2, upon receiving it, removes the S0/1 interface from the outgoing interface list for this flow, that is, the list of interfaces out of which this traffic should be sent.

The following is a more formal definition of a PIM Prune message:
The PIM Prune message is sent by one router to a second router to cause the second router to remove the link on which the Prune is received from a particular (S,G) SPT.

Upon receiving the Prune message, R2 sets a Prune timer of 3 minutes. After three minutes it starts sending traffic again until it receives another Prune message. This is how PIMv1 behaves.
PIMv2 added a State Refresh timer (60 seconds by default). As soon as R3 sends a Prune message, this timer is started on R3. When the timer expires, R3 sends a State Refresh message, which resets the 3-minute Prune timer on R2 for that group.
Reasons for sending the Prune message:

  • When a multicast packet fails RPF check.
  • When there are no locally connected clients that have requested a multicast group (IGMP Join) and there are no PIM neighbors to which multicast traffic can be sent (Non-prune Interface).

Graft message.
Imagine that R3 did not want traffic from R2, sent a Prune, and was receiving the multicast from R1. Then the link between R1 and R3 suddenly went down and R3 was left without multicast. R3 could wait 3 minutes until the Prune timer expires on R2, but 3 minutes is a long time. To avoid waiting, R3 needs to send a message that immediately brings interface S0/1 on R2 out of the pruned state. That message is the Graft message. Upon receiving a Graft, R2 replies with a Graft-Ack.
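The interplay of Prune, State Refresh, and Graft on the upstream router can be modeled with a small Python sketch. It is only a toy model of the per-flow state on a router such as R2, assuming a 180-second Prune timer as in the text; the class name `DenseModeFlow` and its methods are hypothetical.

```python
PRUNE_TIMER = 180   # seconds, the 3-minute Prune timer from the text


class DenseModeFlow:
    """Hypothetical per-(S,G) state on an upstream router such as R2."""

    def __init__(self, interfaces):
        self.interfaces = set(interfaces)
        self.pruned_until = {}   # interface -> time at which the prune expires

    def on_prune(self, iface, now):
        # Prune: stop forwarding on this interface for PRUNE_TIMER seconds.
        self.pruned_until[iface] = now + PRUNE_TIMER

    def on_state_refresh(self, iface, now):
        # State Refresh restarts the Prune timer if the interface is pruned.
        if iface in self.pruned_until:
            self.pruned_until[iface] = now + PRUNE_TIMER

    def on_graft(self, iface):
        # Graft: leave the pruned state immediately and acknowledge.
        self.pruned_until.pop(iface, None)
        return "Graft-Ack"

    def outgoing(self, now):
        # Interfaces that are forwarding at time `now`.
        return sorted(
            i for i in self.interfaces
            if self.pruned_until.get(i, 0) <= now
        )


flow = DenseModeFlow(["S0/0", "S0/1"])
flow.on_prune("S0/1", now=0)
print(flow.outgoing(now=10))    # ['S0/0']
print(flow.on_graft("S0/1"))    # Graft-Ack
print(flow.outgoing(now=11))    # ['S0/0', 'S0/1']
```

Without the Graft, S0/1 would return to the outgoing list only when the Prune timer runs out, 180 seconds after the last Prune or State Refresh.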
Prune Override.
Let's look at the following scenario. R1 forwards multicast to a segment with two routers. R3 receives the traffic and forwards it on; R2 receives it too, but has nobody to forward it to, so it sends a Prune message to R1 on this segment. R1 should remove Fa0/0 from the list and stop forwarding to this segment, but what happens to R3? R3 is on the same segment, so it also receives this Prune message and understands the tragedy of the situation. Before R1 stops forwarding, it starts a 3-second timer and stops only when that timer expires. Those 3 seconds are exactly how long R3 has to avoid losing its multicast. Therefore R3 sends a PIM Join message for this group as soon as possible, and R1 no longer thinks about stopping. More about Join messages below.
Assert message.
Let's imagine the following situation: two routers forward to the same network at once. They receive the same stream from the source, and both forward it to the same network behind the e0 interface. They therefore need to determine which of them will be the single forwarder for this network. Assert messages are used for this. When R2 and R3 detect duplicate multicast traffic, that is, each of them receives the same multicast that it is itself forwarding, the routers understand that something is wrong. In this case they send Assert messages that include the Administrative Distance (AD) and the metric of the route by which the multicast source, 10.1.1.10, is reached. The winner is determined as follows:

  1. The router with the lower AD wins.
  2. If the ADs are equal, the router with the lower metric wins.
  3. If the metrics are also equal, the router with the higher IP address on the network to which they forward this multicast wins.

The winner of this election becomes the forwarder for the segment. The Designated Router (DR) is elected separately, through PIM Hellos: the PIM Hello message carries a DR field, and the router with the highest IP address on the link wins.
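The three comparison steps for Assert messages can be written as a single sort key. The Python sketch below is an illustration only; the function name `assert_winner`, the tuple layout, and the example AD, metric, and address values are hypothetical.

```python
import ipaddress


def assert_winner(candidates):
    """Pick the Assert winner among (router, ad, metric, ip) tuples.

    Lower AD wins, then lower metric, then the higher IP address
    on the shared segment.
    """
    def key(candidate):
        _name, ad, metric, ip = candidate
        # Negate the address so that min() prefers the higher IP.
        return (ad, metric, -int(ipaddress.ip_address(ip)))

    return min(candidates, key=key)[0]


# Same AD and metric: the higher IP address decides.
print(assert_winner([("R2", 110, 20, "10.2.2.2"),
                     ("R3", 110, 20, "10.2.2.3")]))   # R3

# A lower AD beats a better metric.
print(assert_winner([("R2", 90, 500, "10.2.2.2"),
                     ("R3", 110, 20, "10.2.2.3")]))   # R2
```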
MROUTE Table.
After this first look at how PIM works, we need to understand how to work with the multicast routing table. The mroute table stores information about which streams clients have requested and which streams are arriving from multicast servers.
For example, when an IGMP Membership Report or PIM Join is received on an interface, an entry of the form (*, G) is added to the routing table.
This entry means that a request was received for traffic addressed to group 238.38.38.38. In the DC flags, D means that the multicast works in Dense mode and C means that the recipient is directly connected to the router, that is, the router received an IGMP Membership Report rather than a PIM Join.
An entry of the form (S, G) means that a multicast stream is present.
The S field, 192.168.1.11, holds the IP address of the multicast source, which is checked by the RPF rule. In case of problems, the first step is to check the unicast table for the route to the source. The Incoming Interface field specifies the interface on which the multicast arrives; in the unicast routing table, the route to the source must point to this interface. The Outgoing Interface list indicates where the multicast will be forwarded. If it is empty, the router has not received any requests for this traffic.
PIM Sparse-mode.
The Sparse-mode strategy is the opposite of Dense mode. When a Sparse-mode router receives multicast traffic, it forwards it only through those interfaces on which the stream was requested, for example by PIM Join or IGMP Report messages.
Similar elements for SM and DM:

  • Neighbor relationships are built in the same way as in PIM DM.
  • The RPF rule works.
  • The choice of DR is similar.
  • The Prune Override mechanism and Assert messages work the same way.

To keep track of who needs which multicast traffic and where, the network needs a common information center. The Rendezvous Point (RP) serves as this center. A router whose clients want some multicast traffic sends its request to the RP, and a router that starts receiving multicast traffic from a source sends that traffic to the RP.
When the RP receives multicast traffic, it forwards it to those routers that previously requested it.
Imagine a topology where the RP is R3. As soon as R1 receives traffic from S1, it encapsulates the multicast packet in a unicast PIM Register message and sends it to the RP. How does it know which router is the RP? In this case the RP is configured statically; dynamic RP configuration is covered later.

ip pim rp-address 3.3.3.3

The RP checks whether anyone has asked to receive this traffic. Let's assume nobody has. The RP then sends R1 a PIM Register-Stop message, which means that no one needs this multicast and the registration is refused. R1 stops sending the multicast to the RP. The source host keeps sending it, though, so R1, upon receiving the Register-Stop, starts a Register-Suppression timer of 60 seconds. 5 seconds before this timer expires, R1 sends the RP an empty Register message with the Null-Register bit set (that is, with no encapsulated multicast packet). The RP, in turn, acts as follows:

  • If there are still no recipients, it responds with a Register-Stop message.
  • If recipients have appeared, it does not respond at all. R1, having received no refusal within those 5 seconds, will be delighted and will send the RP a Register message with the encapsulated multicast.
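
The timing of this exchange on R1 can be laid out as a short timeline. The Python sketch below is a simplification under the timer values given in the text; the function name `register_timeline` and the event labels are hypothetical.

```python
REGISTER_SUPPRESSION = 60   # seconds, from the text
NULL_REGISTER_PROBE = 5     # seconds before expiry, from the text


def register_timeline(register_stop_at, second_stop_received):
    """Return the (time, event) list on R1 after a Register-Stop arrives.

    `second_stop_received` says whether the RP answers the
    Null-Register with another Register-Stop.
    """
    probe_at = register_stop_at + REGISTER_SUPPRESSION - NULL_REGISTER_PROBE
    expiry = register_stop_at + REGISTER_SUPPRESSION
    events = [
        (register_stop_at, "start Register-Suppression timer"),
        (probe_at, "send Null-Register"),
    ]
    if second_stop_received:
        # The RP still has no recipients: suppression starts over.
        events.append((probe_at, "Register-Stop received, restart timer"))
    else:
        # Silence from the RP means recipients have appeared.
        events.append((expiry, "resume Register with encapsulated multicast"))
    return events


for when, what in register_timeline(0, second_stop_received=False):
    print(when, what)
```

With the Register-Stop arriving at time 0, the Null-Register goes out at 55 seconds, and full Register messages resume at 60 seconds if the RP stays silent.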

How multicast reaches the RP is now sorted out, so let's answer the question of how the RP delivers traffic to recipients. Here we need a new concept, the root-path tree (RPT). The RPT is a tree rooted at the RP, growing toward the recipients and branching at each PIM-SM router. The RP builds it by receiving PIM Join messages, each of which adds a new branch to the tree, and every downstream router does the same. The general rule looks like this:

  • When a PIM-SM router receives a PIM Join message on any interface other than the one leading to the RP, it adds a new branch to the tree.
  • A branch is also added when the PIM-SM router receives an IGMP Membership Report from a directly connected host.

Imagine that we have a multicast client behind router R5 for group 228.8.8.8. As soon as R5 receives the IGMP Membership Report from the host, it sends a PIM Join toward the RP and adds the host-facing interface to the tree. Next, R4 receives the PIM Join from R5, adds interface Gi0/1 to the tree, and sends a PIM Join toward the RP. Finally, the RP (R3) receives the PIM Join and adds Gi0/0 to the tree. The multicast recipient is now registered, and we have built a tree from the root: R3-Gi0/0 → R4-Gi0/1 → R5-Gi0/0.
After that, a PIM Join is sent to R1 and R1 starts sending multicast traffic. Note that if the host requested the traffic before the multicast transmission began, the RP does not send a PIM Join and sends nothing toward R1 at all.
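The hop-by-hop way in which the Join from R5 builds the tree can be sketched in Python as follows. The function `build_rpt` and its argument layout are hypothetical; the router and interface names are the ones from the example.

```python
def build_rpt(last_hop, host_iface, upstream):
    """Build per-router outgoing interface lists for one group.

    `last_hop` is the router that received the IGMP Membership Report
    on `host_iface`. `upstream` lists (router, interface on which the
    PIM Join arrived) in order from the last hop toward the RP.
    """
    # The IGMP report adds the host-facing interface on the last hop.
    oil = {last_hop: [host_iface]}
    for router, join_iface in upstream:
        # Every Join received adds a branch on that router.
        oil.setdefault(router, []).append(join_iface)
    return oil


tree = build_rpt("R5", "Gi0/0", [("R4", "Gi0/1"), ("R3", "Gi0/0")])
print(tree)   # {'R5': ['Gi0/0'], 'R4': ['Gi0/1'], 'R3': ['Gi0/0']}
```

Reading the result from the RP downward gives the same R3, R4, R5 chain as in the text.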
If the host stops wanting the multicast while it is being sent, then as soon as the RP receives a PIM Prune on the Gi0/0 interface, it immediately sends a PIM Register-Stop directly to R1, followed by a PIM Prune message out of the Gi0/1 interface. The PIM Register-Stop is sent by unicast to the address from which the PIM Register was received.
As mentioned earlier, as soon as one router sends a PIM Join to another, for example R5 to R4, an entry is added on R4.
A timer is also started on R4. R5 has to keep resetting it by sending PIM Join messages periodically; otherwise the interface toward R5 is removed from the outgoing list on R4. R5 sends a PIM Join every 60 seconds.
Shortest Path Tree Switchover.
Let's add a link between R1 and R5 and see how traffic flows in this topology.
Let's say traffic was being sent and received over the old R1-R2-R3-R4-R5 path, and then we connected and configured the link between R1 and R5.
First of all, the unicast routing table on R5 is rebuilt, and the 192.168.1.0/24 network is now reached through R5's Gi0/2 interface. Now R5, when it receives the multicast on Gi0/1, understands that the RPF rule is not satisfied and that it would be more logical to receive the multicast on Gi0/2. It should disconnect from the RPT and build a shorter tree, the Shortest-Path Tree (SPT). To do this, R5 sends a PIM Join to R1 via Gi0/2, and R1 starts sending the multicast via Gi0/2 as well. Now R5 needs to unsubscribe from the RPT so as not to receive two copies. To do this, it sends a Prune message that specifies the IP address of the source and has a special bit set, the RPT-bit. This means: do not send me this traffic, I have a better tree. The RP also sends a PIM Prune toward R1, but does not send a Register-Stop message. Another peculiarity: R5 now keeps sending PIM Prune messages to the RP, because R1 keeps sending a PIM Register to the RP every minute. Until new recipients for this traffic appear, the RP keeps refusing it. In this way R5 notifies the RP that it continues to receive the multicast via the SPT.
Dynamic RP search.
AutoRP.

This technology is Cisco proprietary and not very popular, but still alive. Auto-RP operates in two main stages:
1) The RP sends RP-Announce messages to the reserved address 224.0.1.39, announcing itself as the RP either for all groups or for specific ones. This message is sent every minute.
2) An RP mapping agent is needed, which sends RP-Discovery messages indicating which RP to use for which groups. It is from this message that ordinary PIM routers determine the RP for themselves. The mapping agent can be either the RP router itself or a separate PIM router. RP-Discovery is sent to 224.0.1.40 every minute.
Let's look at the process in more detail:
Set up R3 as RP:

ip pim send-rp-announce loopback 0 scope 10

R2 as mapping agent:

ip pim send-rp-discovery loopback 0 scope 10

On all the other routers we will learn the RP through Auto-RP:

ip pim autorp listener

Once we set up R3, it will start sending RP-Announce:
And R2, after setting up the mapping agent, will start waiting for the RP-Announce message. Only when it finds at least one RP will it start sending RP-Discovery:
Thus, as soon as ordinary routers (PIM RP listeners) receive this message, they know where to look for the RP.
One of the main problems with Auto-RP is that in order to receive RP-Announce and RP-Discovery messages, a router needs to send PIM Joins for addresses 224.0.1.39 and 224.0.1.40, and in order to send those Joins, it needs to know where the RP is. The classic chicken-and-egg problem. PIM Sparse-Dense mode was invented to solve it: if the router does not know the RP, it works in Dense mode; if it does, in Sparse mode. When PIM Sparse mode and the ip pim autorp listener command are configured on the interfaces of ordinary routers, the router operates in Dense mode only for the multicast of the Auto-RP protocol itself (224.0.1.39 and 224.0.1.40).
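How an ordinary router might use the group-to-RP mappings it has learned, and fall back to Dense mode when no RP is known, can be sketched in Python. This is a simplified model: the functions `rp_for_group` and `forwarding_mode` are hypothetical, the group range in the example is made up, and preferring the most specific matching range is an assumption of this sketch.

```python
import ipaddress


def rp_for_group(group, mappings):
    """Return the RP address for `group`, or None if no mapping matches.

    `mappings` is a list of (group range, RP address) pairs as learned
    from RP-Discovery. The most specific range wins (an assumption).
    """
    g = ipaddress.ip_address(group)
    best = None
    for prefix, rp in mappings:
        net = ipaddress.ip_network(prefix)
        if g in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, rp)
    return best[1] if best else None


def forwarding_mode(group, mappings):
    """Sparse-Dense mode: Sparse when the RP is known, Dense otherwise."""
    return "sparse" if rp_for_group(group, mappings) else "dense"


learned = [("224.0.0.0/4", "3.3.3.3")]
print(rp_for_group("228.8.8.8", learned))     # 3.3.3.3
print(forwarding_mode("228.8.8.8", learned))  # sparse
print(forwarding_mode("228.8.8.8", []))       # dense
```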
BootStrap Router (BSR).
This mechanism works similarly to Auto-RP. Each RP sends a message to the mapping agent, which collects the mapping information and then tells all other routers. Let's describe the process the same way as for Auto-RP:
1) We configure R3 as a candidate RP with the command:

ip pim rp-candidate loopback 0

At this point R3 does nothing yet: before it can start sending its special messages, it first needs to find a mapping agent. That brings us to the second step.
2) Set up R2 as a mapping agent:

ip pim bsr-candidate loopback 0

R2 starts sending PIM Bootstrap messages, where it indicates itself as a mapping agent:
This message is sent to the address 224.0.0.13, which PIM also uses for its other messages. The messages are sent in all directions, so there is no chicken-and-egg problem as there was with Auto-RP.
3) As soon as the RP receives a message from the BSR, it immediately sends a unicast message to the BSR's address.
After that, the BSR, having received information about the RPs, sends it by multicast to the address 224.0.0.13, which all PIM routers listen to. That is why BSR has no equivalent of the ip pim autorp listener command for ordinary routers.
Anycast RP with Multicast Source Discovery Protocol (MSDP).
Auto-RP and BSR let us distribute the load on RPs only in the following way: each multicast group has exactly one active RP. Load balancing for a single multicast group across several RPs is not possible. Anycast RP with MSDP makes it possible by giving the RP routers the same IP address with a mask of 255.255.255.255. The RP address itself is still learned by one of the usual methods: static configuration, Auto-RP, or BSR.
In this example we have an Auto-RP configuration with MSDP. Both RPs are configured with IP address 172.16.1.1/32 on the Loopback 1 interface, which is used for all groups. In RP-Announce, both routers announce themselves using this address. The Auto-RP mapping agent, having received this information, sends RP-Discovery messages about the RP with address 172.16.1.1/32. The 172.16.1.1/32 network is advertised to the routers through the IGP. As a result, PIM routers request or register flows at the RP that is the next hop on their route to 172.16.1.1/32. MSDP itself is designed for the RPs to exchange information about multicast sources with each other.
Consider the following topology:
Switch6 sends traffic to the address 238.38.38.38, and so far only RP-R1 knows about it. Switch7 and Switch8 have requested this group. Routers R5 and R4 send a PIM Join to R1 and R3, respectively. Why? By IGP metric, the route to 13.13.13.13 on R5 points to R1, and on R4 it points to R3.
RP-R1 knows about the stream and starts forwarding it toward R5, but R4 knows nothing about it, since R1 will not send the stream there unprompted. This is why MSDP is needed. Set it up on R1 and R3.

On R1:

ip msdp peer 3.3.3.3 connect-source Loopback1

On R3:

ip msdp peer 1.1.1.1 connect-source Loopback3

They bring up a session with each other, and each reports any stream it receives to its RP neighbor.
As soon as RP-R1 receives the stream from Switch6, it immediately sends a unicast MSDP Source-Active message containing (S, G) information, that is, the source and the destination group of the multicast. Now RP-R3 knows about the source Switch6, so when it receives a request from R4 for this stream, it sends a PIM Join toward Switch6, guided by its routing table. Having received this PIM Join, R1 starts sending traffic toward RP-R3.
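The Source-Active exchange between the two RPs can be modeled with a toy Python class. It leaves out everything real MSDP adds on top (the TCP session, keepalives, SA timers, peer checks); the class `RendezvousPoint`, its methods, and the source address 192.168.1.11 used for Switch6 are hypothetical.

```python
class RendezvousPoint:
    """Hypothetical RP that keeps an MSDP Source-Active cache."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.sa_cache = {}      # group -> set of known sources
        self.receivers = set()  # groups requested by downstream routers
        self.joins_sent = []    # (S, G) Joins sent toward a source

    def peer_with(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def on_register(self, source, group):
        # A local source registered: remember it and tell every MSDP peer.
        self.sa_cache.setdefault(group, set()).add(source)
        for peer in self.peers:
            peer.on_source_active(source, group)

    def on_source_active(self, source, group):
        # Source-Active from a peer: cache it, join if someone wants it.
        self.sa_cache.setdefault(group, set()).add(source)
        if group in self.receivers:
            self._join(source, group)

    def on_join(self, group):
        # A downstream router asked for the group: join known sources.
        self.receivers.add(group)
        for source in sorted(self.sa_cache.get(group, ())):
            self._join(source, group)

    def _join(self, source, group):
        if (source, group) not in self.joins_sent:
            self.joins_sent.append((source, group))


r1, r3 = RendezvousPoint("RP-R1"), RendezvousPoint("RP-R3")
r1.peer_with(r3)
r3.on_join("238.38.38.38")                     # request from R4
r1.on_register("192.168.1.11", "238.38.38.38")  # stream from Switch6
print(r3.joins_sent)   # [('192.168.1.11', '238.38.38.38')]
```

The sketch sends the Join in either order of events: when the request from R4 arrives after the Source-Active message, and when it was already waiting before the source appeared.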
MSDP runs over TCP, and the RPs send each other keepalive messages to check liveness. The timer is 60 seconds.
The purpose of splitting MSDP peers into different domains remains unclear, since Keepalive and SA messages do not indicate membership in any domain. A configuration with different domains was also tested in this topology, and it made no difference to the operation.
If someone can clarify, I'll be happy to read about it in the comments.

Source: habr.com
