author: Loïc Sang <loic.sang@6wind.com> 2024-06-19 16:19:22 +0200
committer: Loïc Sang <loic.sang@6wind.com> 2024-06-26 16:11:16 +0200
commit: e0ae285eb8beeef7b43bdadc073d8ae346eaeb6c
tree: 2c4779a930cc186d6a927c9e7b2f54f007d6b1e0
parent: Merge pull request #16252 from chiragshah6/evpn_dev1
bgpd: avoid clearing routes for peers that were never established
Under heavy system load with many peers in passive mode and a large
number of routes, bgpd can enter an infinite loop. This occurs while
processing timed-out BGP_OPEN messages, which prevents it from accepting
new connections. The following log entries illustrate the issue:

>bgpd[6151]: [VX6SM-8YE5W][EC 33554460] 3.3.2.224: nexthop_set failed, resetting connection - intf 0x0
>bgpd[6151]: [P790V-THJKS][EC 100663299] bgp_open_receive: bgp_getsockname() failed for peer: 3.3.2.224
>bgpd[6151]: [HTQD2-0R1WR][EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 3.3.2.224
... repeating

The issue occurs when bgpd handles a massive number of routes in the RIB
while receiving numerous BGP_OPEN packets. If bgpd is overloaded, it
fails to process these packets promptly, leading the remote peer to
close the connection and resend BGP_OPEN packets. When bgpd eventually
starts processing these timed-out BGP_OPEN packets, it finds the TCP
connection closed by the remote peer, resulting in "bgp_stop()" being
called. For each timed-out peer, bgpd must iterate through the routing
table, which is time-consuming and causes new incoming BGP_OPEN packets
to time out, perpetuating the infinite loop.

To address this issue, the code is modified to check whether the peer
has been established at least once before calling
"bgp_clear_route_all()". This ensures that routes are only cleared for
peers that had a successful session, preventing unnecessary iterations
over the routing table for peers that never established a connection.

With this change, BGP_OPEN timeouts may still occur, but even in the
worst case bgpd will stabilize. Before this patch, bgpd could enter a
loop where it was unable to accept any new connections.

Signed-off-by: Loïc Sang <loic.sang@6wind.com>
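The guard described above can be modelled in isolation. The following is a minimal sketch, not bgpd code: every `model_*` and `MODEL_*` name is hypothetical, the status enum is a simplified stand-in for bgpd's FSM states, and `established` is assumed to count how many times the peer reached the Established state. A `table_walks` counter stands in for the expensive routing-table iteration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for bgpd's FSM states. */
enum model_status {
	MODEL_IDLE = 1,
	MODEL_CONNECT,
	MODEL_ACTIVE,
	MODEL_OPENSENT,
	MODEL_OPENCONFIRM,
	MODEL_ESTABLISHED,
	MODEL_CLEARING,
	MODEL_DELETED
};

/* Hypothetical peer: only the fields this sketch needs. */
struct model_peer {
	unsigned int established; /* assumed: times the session came up */
	unsigned int table_walks; /* stand-in for the cost of clearing */
};

/* Hypothetical BGP instance holding the special "self" peer. */
struct model_bgp {
	struct model_peer *peer_self;
};

/* Stand-in for the full routing-table walk; only counts invocations. */
void model_clear_route_all(struct model_peer *peer)
{
	peer->table_walks++;
}

/* The patched condition: clear only if the peer could own routes. */
bool model_should_clear(const struct model_bgp *bgp,
			const struct model_peer *peer,
			enum model_status status)
{
	return status >= MODEL_CLEARING &&
	       (peer->established || peer == bgp->peer_self);
}

/* Reduced status change: only the route-clearing decision is modelled. */
void model_change_status(struct model_bgp *bgp, struct model_peer *peer,
			 enum model_status status)
{
	if (model_should_clear(bgp, peer, status))
		model_clear_route_all(peer);
}
```

In this model, moving a peer with `established == 0` to `MODEL_CLEARING` leaves `table_walks` untouched, while a peer that was established at least once, or the instance's own `peer_self`, still triggers the walk.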
Diffstat (limited to 'bgpd/bgp_fsm.c')
-rw-r--r-- bgpd/bgp_fsm.c | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/bgpd/bgp_fsm.c b/bgpd/bgp_fsm.c
index 15cc5dbe2..d41ef8abb 100644
--- a/bgpd/bgp_fsm.c
+++ b/bgpd/bgp_fsm.c
@@ -1241,7 +1241,7 @@ void bgp_fsm_change_status(struct peer_connection *connection,
 	/* Transition into Clearing or Deleted must /always/ clear all routes..
 	 * (and must do so before actually changing into Deleted..
 	 */
-	if (status >= Clearing) {
+	if (status >= Clearing && (peer->established || peer == bgp->peer_self)) {
 		bgp_clear_route_all(peer);
 		/* If no route was queued for the clear-node processing,