path: root/zebra

Commit message — Author — Date — Files — Lines (-deleted/+added)

* Merge pull request #17023 from donaldsharp/dplane_problems [base_10.2] — Russ White, 2024-10-08 (6 files, -44/+32)

  zebra: Allow dplane to pass larger number of nexthops down to dataplane

  * *: Allow 16 bit size for nexthops — Donald Sharp, 2024-10-08 (6 files, -44/+32)

    Currently FRR limits the nexthop count to a uint8_t rather than a
    uint16_t. When the nexthop count reaches 256, the counter overflows
    to 0, causing problems in the code.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

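To illustrate the overflow this commit describes, a minimal standalone C
sketch (generic counters, not FRR's actual data structures):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t narrow = 0;  /* the old 8-bit nexthop counter */
        uint16_t wide = 0;   /* the new 16-bit counter */

        for (int i = 0; i < 256; i++) {
            narrow++;   /* wraps back to 0 on the 256th increment */
            wide++;
        }

        printf("uint8_t count: %u, uint16_t count: %u\n",
               (unsigned)narrow, (unsigned)wide);
        /* prints: uint8_t count: 0, uint16_t count: 256 */
        return 0;
    }
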
* zebra: Do not retry in 30 seconds on pw reachability failure — Donald Sharp, 2024-10-08 (1 file, -1/+8)

  Currently the zebra pw code sets up a retry to install the pw 30
  seconds after deciding that reachability to the pw is gone. This
  creates a failure mode where the pw is simply reinstalled after 30
  seconds even in the non-reachability case. Instead, the pw should
  only be reinstalled once reachability is restored.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

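A minimal sketch of the behavioral change, assuming hypothetical names
(pw_is_reachable(), zebra_pw_install_retry(), PW_RETRY_SECONDS and the
install_retry_event field are illustrative, not the actual zebra
symbols; event_add_timer() is FRR's real lib timer API):

    /* Old behavior: always rearm the 30-second install retry.
     * New behavior: only schedule reinstall once reachability is back. */
    static void zebra_pw_reachability_changed(struct zebra_pw *pw)
    {
        if (!pw_is_reachable(pw))
            return;  /* wait for a reachability-restored event instead */

        event_add_timer(zrouter.master, zebra_pw_install_retry, pw,
                        PW_RETRY_SECONDS, &pw->install_retry_event);
    }
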
* zebra: Move pw status setting until after we get results — Donald Sharp, 2024-10-08 (3 files, -29/+23)

  Currently the pw code sets the status of the pw for install and
  uninstall immediately when notifying the dplane. This is incorrect:
  we do not actually know the status at that point in time. The status
  should be set when the result comes back.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* Merge pull request #17013 from dksharp5/removal_functions — Donatas Abraitis, 2024-10-07 (6 files, -282/+0)

  Removal functions

  * lib,zebra: remove unused ZEBRA_VRF_UNREGISTER — Donna Sharp, 2024-10-07 (1 file, -17/+0)

    Signed-off-by: Donna Sharp <dksharp5@gmail.com>

  * zebra: remove unused function from tc_netlink.c — Donna Sharp, 2024-10-07 (1 file, -21/+0)

    Signed-off-by: Donna Sharp <dksharp5@gmail.com>

  * zebra: remove unused function from if_netlink.c — Donna Sharp, 2024-10-07 (3 files, -223/+0)

    Signed-off-by: Donna Sharp <dksharp5@gmail.com>

  * zebra: remove unused function from tc_netlink.c — Donna Sharp, 2024-10-07 (2 files, -21/+0)

    Signed-off-by: Donna Sharp <dksharp5@gmail.com>

* zebra: remove unused function rib_lookup_ipv4 — Donna Sharp, 2024-10-07 (2 files, -42/+0)

  Signed-off-by: Donna Sharp <dksharp5@gmail.com>

* Merge pull request #16800 from donaldsharp/nhg_reuse_intf_down_up — Russ White, 2024-10-04 (4 files, -10/+183)

  Nhg reuse intf down up

  * zebra: Attempt to reuse NHG after interface up and route reinstall — Donald Sharp, 2024-09-16 (3 files, -4/+159)

    The previous commit modified zebra to reinstall the singleton
    nexthops for a nexthop group when an interface event comes up. Now
    let's modify zebra to attempt to reuse the nexthop group when this
    happens and the upper level protocol resends the route down with
    it. Only match if the protocol and instance are the same and the
    nexthop groups would match.

    Here is the new behavior:

      eva(config)# do show ip route 9.9.9.9/32
      Routing entry for 9.9.9.9/32
        Known via "static", distance 1, metric 0, best
        Last update 00:00:08 ago
        * 192.168.99.33, via dummy1, weight 1
        * 192.168.100.33, via dummy2, weight 1
        * 192.168.101.33, via dummy3, weight 1
        * 192.168.102.33, via dummy4, weight 1

      eva(config)# do show ip route nexthop-group 9.9.9.9/32
      % Unknown command: do show ip route nexthop-group 9.9.9.9/32

      eva(config)# do show ip route 9.9.9.9/32 nexthop-group
      Routing entry for 9.9.9.9/32
        Known via "static", distance 1, metric 0, best
        Last update 00:00:54 ago
        Nexthop Group ID: 57
        * 192.168.99.33, via dummy1, weight 1
        * 192.168.100.33, via dummy2, weight 1
        * 192.168.101.33, via dummy3, weight 1
        * 192.168.102.33, via dummy4, weight 1

      eva(config)# exit
      eva# conf
      eva(config)# int dummy3
      eva(config-if)# shut
      eva(config-if)# no shut
      eva(config-if)# do show ip route 9.9.9.9/32 nexthop-group
      Routing entry for 9.9.9.9/32
        Known via "static", distance 1, metric 0, best
        Last update 00:00:08 ago
        Nexthop Group ID: 57
        * 192.168.99.33, via dummy1, weight 1
        * 192.168.100.33, via dummy2, weight 1
        * 192.168.101.33, via dummy3, weight 1
        * 192.168.102.33, via dummy4, weight 1

      eva(config-if)# exit
      eva(config)# exit
      eva# exit
      sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 57
      id 57 group 37/43/50/58 proto zebra
      sharpd@eva ~/frr1 (master)> ip route show 9.9.9.9/32
      9.9.9.9 nhid 57 proto 196 metric 20
        nexthop via 192.168.99.33 dev dummy1 weight 1
        nexthop via 192.168.100.33 dev dummy2 weight 1
        nexthop via 192.168.101.33 dev dummy3 weight 1
        nexthop via 192.168.102.33 dev dummy4 weight 1
      sharpd@eva ~/frr1 (master)>

    Notice that we are no longer creating a bunch of new nexthop
    groups.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

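A rough sketch of the reuse test described above (field and helper
names here are assumptions for illustration, not the actual zebra_nhg.c
code):

    /* Reuse an existing NHG only when the re-sent route matches on
     * protocol, instance, and the nexthops themselves. */
    static bool nhe_matches_route(const struct nhg_hash_entry *nhe,
                                  const struct route_entry *re)
    {
        return nhe->type == re->type &&
               nhe->instance == re->instance &&
               nexthop_group_equal(&nhe->nhg, &re->nhe->nhg);
    }
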
  * zebra: Reinstall nexthop when interface comes back up — Donald Sharp, 2024-09-16 (1 file, -0/+11)

    If an interface down event caused a nexthop group to remove one of
    its entries in the kernel, have that entry reinstalled when the
    interface comes back up, and mark the nexthop as usable.

    New behavior:

      eva# show nexthop-group rib 181818168
      ID: 181818168 (sharp)
        RefCnt: 1
        Uptime: 00:00:23
        VRF: default(bad-value)
        Valid, Installed
        Depends: (35) (38) (44) (51)
        via 192.168.99.33, dummy1 (vrf default), weight 1
        via 192.168.100.33, dummy2 (vrf default), weight 1
        via 192.168.101.33, dummy3 (vrf default), weight 1
        via 192.168.102.33, dummy4 (vrf default), weight 1

      eva# conf
      eva(config)# int dummy3
      eva(config-if)# shut
      eva(config-if)# do show nexthop-group rib 181818168
      ID: 181818168 (sharp)
        RefCnt: 1
        Uptime: 00:00:44
        VRF: default(bad-value)
        Depends: (35) (38) (44) (51)
        via 192.168.99.33, dummy1 (vrf default), weight 1
        via 192.168.100.33, dummy2 (vrf default), weight 1
        via 192.168.101.33, dummy3 (vrf default) inactive, weight 1
        via 192.168.102.33, dummy4 (vrf default), weight 1

      eva(config-if)# no shut
      eva(config-if)# do show nexthop-group rib 181818168
      ID: 181818168 (sharp)
        RefCnt: 1
        Uptime: 00:00:53
        VRF: default(bad-value)
        Valid, Installed
        Depends: (35) (38) (44) (51)
        via 192.168.99.33, dummy1 (vrf default), weight 1
        via 192.168.100.33, dummy2 (vrf default), weight 1
        via 192.168.101.33, dummy3 (vrf default), weight 1
        via 192.168.102.33, dummy4 (vrf default), weight 1

      eva(config-if)# exit
      eva(config)# exit
      eva# exit
      sharpd@eva ~/frr1 (master) [255]> ip nexthop show id 181818168
      id 181818168 group 35/38/44/51 proto 194
      sharpd@eva ~/frr1 (master)>

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

  * zebra: Expose _route_entry_dump_nh so it can be used — Donald Sharp, 2024-09-16 (2 files, -5/+8)

    Expose this helper function so it can be used in zebra_nhg.c.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

  * zebra: Properly note that a nhg's nexthop has gone down — Donald Sharp, 2024-09-16 (1 file, -3/+7)

    The current code, when a link is set down, just marks the nexthop
    group as not properly set up. This leaves situations where an
    interface goes down, show output is requested, and we display
    incorrect state; the same is true for anything else checking those
    flags at that point in time. Modify the interface-down nexthop
    group code to set the appropriate flags on the nexthops and allow
    a `show ip route` command to display what is actually going on:

      eva# show ip route 1.0.0.0
      Routing entry for 1.0.0.0/32
        Known via "sharp", distance 150, metric 0, best
        Last update 00:00:06 ago
        * 192.168.44.33, via dummy1, weight 1
        * 192.168.45.33, via dummy2, weight 1

      sharpd@eva:~/frr1$ sudo ip link set dummy2 down

      eva# show ip route 1.0.0.0
      Routing entry for 1.0.0.0/32
        Known via "sharp", distance 150, metric 0, best
        Last update 00:00:12 ago
        * 192.168.44.33, via dummy1, weight 1
          192.168.45.33, via dummy2 inactive, weight 1

    Notice that the 1.0.0.0/32 route now correctly displays the state
    of the nexthop group entry.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

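The flag update amounts to something like the following sketch (using
FRR's real generic nexthop flag helpers; the surrounding variables and
exact zebra code are assumed):

    struct nexthop *nexthop;

    /* On interface down, clear ACTIVE on every nexthop that egresses
     * through the downed interface so show output reflects reality. */
    for (ALL_NEXTHOPS(nhe->nhg, nexthop)) {
        if (nexthop->ifindex == ifp->ifindex)
            UNSET_FLAG(nexthop->flags, NEXTHOP_FLAG_ACTIVE);
    }
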
* zebra: Fix crash during reconnect — Igor Zhukov, 2024-10-04 (1 file, -1/+1)

  fpm_enqueue_rmac_table expects an fpm_rmac_arg* as its argument.

  The issue can be reproduced by dropping the TCP session using:

    ss -K dst 127.0.0.1 dport = 2620

  I used Fedora 40 and frr 9.1.2 and I got the gdb backtrace:

    (gdb) bt
    #0  0x00007fdd7d6997ea in fpm_enqueue_rmac_table (bucket=0x2134dd0,
        arg=0x2132b60) at zebra/dplane_fpm_nl.c:1217
    #1  0x00007fdd7dd1560d in hash_iterate (hash=0x21335f0,
        func=0x7fdd7d6997a0 <fpm_enqueue_rmac_table>, arg=0x2132b60)
        at lib/hash.c:252
    #2  0x00007fdd7dd1560d in hash_iterate (hash=0x1e5bf10,
        func=func@entry=0x7fdd7d698900 <fpm_enqueue_l3vni_table>,
        arg=arg@entry=0x7ffed983bef0) at lib/hash.c:252
    #3  0x00007fdd7d698b5c in fpm_rmac_send (t=<optimized out>)
        at zebra/dplane_fpm_nl.c:1262
    #4  0x00007fdd7dd6ce22 in event_call (thread=thread@entry=0x7ffed983c010)
        at lib/event.c:1970
    #5  0x00007fdd7dd20758 in frr_run (master=0x1d27f10) at lib/libfrr.c:1213
    #6  0x0000000000425588 in main (argc=10, argv=0x7ffed983c2e8)
        at zebra/main.c:492

  Signed-off-by: Igor Zhukov <fsb4000@yandex.ru>

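The crash pattern is the classic void-pointer mismatch with lib/hash's
iterator: the callback casts its void *arg, so the caller must pass the
matching struct. A sketch of the corrected call (the fpm_rmac_arg field
names are assumptions; hash_iterate() is the real lib/hash API):

    /* hash_iterate() hands arg through untyped, so passing anything
     * other than a struct fpm_rmac_arg * here corrupts the walk. */
    struct fpm_rmac_arg fra = {
        .fnc = fnc,        /* assumed fields */
        .zl3vni = zl3vni,
    };
    hash_iterate(zl3vni->rmac_table, fpm_enqueue_rmac_table, &fra);
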
* zebra: Add missing proto translations — Donald Sharp, 2024-09-25 (1 file, -0/+4)

  Add missing isis and eigrp proto translations.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Correctly report metrics — Donald Sharp, 2024-09-25 (1 file, -5/+5)

  Report the route's metric in IPFORWARDMETRIC1 and return -1 for the
  other metrics, as required by the IP-FORWARD-MIB:

    inetCidrRouteMetric2 OBJECT-TYPE
        SYNTAX     Integer32
        MAX-ACCESS read-create
        STATUS     current
        DESCRIPTION
            "An alternate routing metric for this route. The
             semantics of this metric are determined by the routing-
             protocol specified in the route's inetCidrRouteProto
             value. If this metric is not used, its value should be
             set to -1."
        DEFVAL { -1 }
        ::= { inetCidrRouteEntry 13 }

  I've included metric2 here, but it's the same for all of them.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

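In code the fix reduces to a pattern like this sketch (re->metric is
zebra's route-entry metric field; the SNMP plumbing around it is
elided and the local variable names are assumptions):

    /* Per IP-FORWARD-MIB, only Metric1 carries the route's metric;
     * the unused alternate metrics must be reported as -1. */
    long metric1 = (long)re->metric;
    long metric2 = -1;
    long metric3 = -1;
    long metric4 = -1;
    long metric5 = -1;
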
* zebra: Let's use memset instead of walking bytes and setting to 0 — Donald Sharp, 2024-09-25 (1 file, -8/+2)

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

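For illustration, the two equivalent forms (a generic example, not the
zebra diff itself):

    #include <string.h>

    void clear_buf(unsigned char *buf, size_t len)
    {
        /* Before: zeroing one byte at a time. */
        for (size_t i = 0; i < len; i++)
            buf[i] = 0;

        /* After: one call states the intent and lets libc optimize. */
        memset(buf, 0, len);
    }
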
* zebra: Fix snmp walk of zebra rib — Donald Sharp, 2024-09-25 (1 file, -2/+4)

  The snmp walk of the zebra rib was skipping entries because
  in_addr_cmp had been replaced with prefix_cmp, which behaves slightly
  differently, causing parts of the zebra rib tree to be skipped.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* lib, zebra: TABLE_NODE is not used — Donald Sharp, 2024-09-24 (1 file, -15/+0)

  No one is using this; remove it.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* Merge pull request #16882 from mjstapp/fix_if_table_unlock — Donatas Abraitis, 2024-09-23 (2 files, -2/+9)

  zebra: unlock if_table route_nodes

  * zebra: unlock if_table route_nodes — Mark Stapp, 2024-09-20 (2 files, -2/+9)

    We must unlock the route_node if we break during iteration over any
    lib/table tree.

    Signed-off-by: Mark Stapp <mjs@cisco.com>

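The lib/table iteration contract: route_next() drops the previous
node's lock as it advances, so an early break leaves the current node
locked unless the caller releases it. A sketch using the real
route_top()/route_next()/route_unlock_node() API (the predicate is
hypothetical):

    struct route_node *rn;

    for (rn = route_top(table); rn; rn = route_next(rn)) {
        if (is_the_node_we_want(rn)) {   /* hypothetical predicate */
            /* route_next() would have dropped this lock for us;
             * breaking early means we must drop it ourselves. */
            route_unlock_node(rn);
            break;
        }
    }
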
* zebra: Pass in ZEBRA_ROUTE_MAX instead of true — Donald Sharp, 2024-09-20 (1 file, -1/+2)

  zebra_nhg_install_kernel takes a route type. We don't know it at that
  particular spot, but we should not be passing in `true`. Use
  ZEBRA_ROUTE_MAX to indicate that the type is unknown, so that the
  correct thing is done.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Send a correct size of ctx->nh6 for SRv6 SEG6_LOCAL_ACTION_END_DX6 — Donatas Abraitis, 2024-09-19 (1 file, -2/+2)

  Fixes: f6e58d26f638d0bcdc34dfc5890669036a0129df ("zebra, sharpd: add srv6 End.DX6 support")

  Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>

* zebra: Remove nl_addraw_l — Donald Sharp, 2024-09-19 (2 files, -23/+0)

  This function is never used, so let's remove it.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: In zebra_evpn_mac.c remove bad comments — Donald Sharp, 2024-09-18 (1 file, -31/+15)

  Adding comments that describe what a variable is doing in the middle
  of a function call makes the formatting extremely hard to read.
  Remove them.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Reindent some badly formatted functions in zebra_evpn_mac.c — Donald Sharp, 2024-09-18 (1 file, -61/+63)

  Fix some badly formatted code to fit better on the screen.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Reframe zebra_evpn_mac.c to be properly formatted — Donald Sharp, 2024-09-18 (1 file, -337/+440)

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: include the prefix in nht show command — Enke Chen, 2024-09-15 (1 file, -4/+8)

  Include the prefix in "show ip nht" and "show ipv6 nht".

  Signed-off-by: Enke Chen <enchen@paloaltonetworks.com>

* zebra: Add more vrf name to debugs — Donald Sharp, 2024-09-11 (4 files, -61/+75)

  Trying to debug some cross-vrf stuff in zebra, and frankly it's hard
  to grep the file for the routes you are interested in. Let's clean
  this up a bit and give developers better information.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* Merge pull request #15259 from dmytroshytyi-6WIND/nexthop_resolution — Russ White, 2024-09-10 (9 files, -32/+219)

  zebra: add LSP entry to nexthop via recursive (part 2)

  * zebra: return void zebra_mpls_lsp_install — Dmytro Shytyi, 2024-06-07 (2 files, -9/+6)

    The integer returned by zebra_mpls_lsp_install() is never checked;
    return void instead.

    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

  * zebra: update nhlfe when the outgoing labels differ — Philippe Guibert, 2024-06-07 (1 file, -1/+26)

    Because the nhlfe label stack may contain more than one label,
    ensure that all labels are copied.

    Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
    Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

  * zebra: the header containing the re->flags is in zclient.h — Philippe Guibert, 2024-06-07 (1 file, -2/+2)

    Change the comment in the code that refers to the ZEBRA_FLAG_XXX
    defines.

    Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

  * zebra: relax mpls entries installation with labeled unicast entries — Philippe Guibert, 2024-06-07 (1 file, -1/+3)

    Until now, when a FEC entry is added in zebra, driven by the
    reception of a BGP labeled unicast update, an LSP entry is created.
    That LSP entry is resolved by using the route entry that is also
    installed by BGP labeled unicast. This route entry is not available
    when we are faced with an iBGP peering session.

    iBGP labeled sessions are used to establish IP connectivity across
    separate IGPs. The dumps below illustrate a 3-IGP topology; an
    attempt is made to create connectivity between the north and the
    south machines. The 3 separate IGPs are configured with Segment
    Routing:

    - north-east
    - east-west
    - west-south

    We create BGP peerings between each endpoint of each IGP:

    - iBGP between (north) and (east)
    - iBGP between (east) and (west)
    - iBGP between (west) and (south)

    Before this patch, the FEC entries could not be resolved on the
    east machine:

      east-vm# show mpls fec
      192.0.2.1/32
        Label: 18
        Client list: bgp(fd 48)
      192.0.2.5/32
        Label: 17
        Client list: bgp(fd 48)
      192.0.2.7/32
        Label: 19
        Client list: bgp(fd 48)

      east-vm# show mpls table
       Inbound Label  Type        Nexthop      Outbound Label
      --------------------------------------------------------
       1011           SR (OSPF)   192.168.2.2  1011
       1022           SR (OSPF)   192.168.2.2  implicit-null
       11044          SR (IS-IS)  192.168.3.4  implicit-null
       11055          SR (IS-IS)  192.168.3.4  11055
       30000          SR (OSPF)   192.168.2.2  implicit-null
       30001          SR (OSPF)   192.168.2.2  implicit-null
       36000          SR (IS-IS)  192.168.3.4  implicit-null

      east-vm# show ip route
      [..]
      B    192.0.2.1/32 [200/0] via 192.0.2.1 inactive, label implicit-null, weight 1, 00:17:45
      O>*  192.0.2.1/32 [110/20] via 192.168.2.2, r3-eth0, label 1011, weight 1, 00:17:47
      O>*  192.0.2.2/32 [110/10] via 192.168.2.2, r3-eth0, label implicit-null, weight 1, 00:17:47
      O    192.0.2.3/32 [110/0] is directly connected, lo, weight 1, 00:17:57
      C>*  192.0.2.3/32 is directly connected, lo, 00:18:03
      I>*  192.0.2.4/32 [115/20] via 192.168.3.4, r3-eth1, label implicit-null, weight 1, 00:17:59
      B    192.0.2.5/32 [200/0] via 192.0.2.5 inactive, label implicit-null, weight 1, 00:17:56
      I>*  192.0.2.5/32 [115/30] via 192.168.3.4, r3-eth1, label 11055, weight 1, 00:17:58
      B>   192.0.2.7/32 [200/0] via 192.0.2.5 (recursive), label 19, weight 1, 00:17:45
        *                       via 192.168.3.4, r3-eth1, label 11055/19, weight 1, 00:17:45
      [..]

    After the "mpls fec nexthop-resolution" command is applied, the FEC
    entries resolve over any non-BGP route that has a labeled path
    selected:

      east-vm# show mpls table
       Inbound Label  Type        Nexthop      Outbound Label
      --------------------------------------------------------
       17             SR (IS-IS)  192.168.3.4  11055
       18             SR (OSPF)   192.168.2.2  1011
       19             BGP         192.168.3.4  11055/19
       1011           SR (OSPF)   192.168.2.2  1011
       1022           SR (OSPF)   192.168.2.2  implicit-null
       11044          SR (IS-IS)  192.168.3.4  implicit-null
       11055          SR (IS-IS)  192.168.3.4  11055
       30000          SR (OSPF)   192.168.2.2  implicit-null
       30001          SR (OSPF)   192.168.2.2  implicit-null
       36000          SR (IS-IS)  192.168.3.4  implicit-null

    Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
    Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>

  * zebra: when adding an LSP, update the log message with multiple labels — Dmytro Shytyi, 2024-06-07 (1 file, -4/+11)

    When an LSP entry is created from a FEC entry, multiple labels may
    now be appended to the LSP entry instead of a single one. Upon LSP
    creation, the LSP trace will display all the appended labels.

    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

  * zebra, lib: upon lsp install, iterate nexthop accordingly — Philippe Guibert, 2024-06-07 (1 file, -6/+21)

    There are two ways of iterating over the nexthops of a given route
    entry:

    - Either only the main nexthops are taken into account (which is
      the case today when attempting to install an LSP entry on a BGP
      connected labeled route).
    - Or nexthops that are resolved and linked in nexthop->resolved of
      a previous nexthop with the RECURSIVE flag set are also taken
      into account.

    The second case has to be handled where recursive routes may be
    used to install an LSP entry. Introduce a new API in nexthop that
    parses over the appropriate nexthops depending on whether the
    nexthop-resolution flag is turned on for the given VRF. Use that
    API in the lsp_install() function so as to walk over the
    appropriate nexthops (see the sketch after this entry).

    Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
    Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

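A sketch of the two iteration modes (NEXTHOP_FLAG_RECURSIVE and the
resolved list are FRR's real nexthop fields; lsp_process_nexthop() and
the per-VRF knob are hypothetical stand-ins):

    struct nexthop *nexthop, *resolved;

    for (nexthop = re->nhe->nhg.nexthop; nexthop; nexthop = nexthop->next) {
        if (vrf_nexthop_resolution_enabled &&   /* assumed per-VRF flag */
            CHECK_FLAG(nexthop->flags, NEXTHOP_FLAG_RECURSIVE)) {
            /* Also walk the nexthops this one resolved to. */
            for (resolved = nexthop->resolved; resolved;
                 resolved = resolved->next)
                lsp_process_nexthop(resolved);
        } else {
            lsp_process_nexthop(nexthop);
        }
    }
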
  * zebra: handle nexthop-resolution updates — Philippe Guibert, 2024-06-07 (1 file, -9/+33)

    Upon reconfiguring nexthop-resolution, update the lsp entries
    accordingly. If fec nexthop-resolution becomes true, call
    fec_change_update_lsp() again for each available fec entry. If fec
    nexthop-resolution becomes false, call fec_change_update_lsp()
    again for each available fec entry, and if the update fails,
    uninstall any lsp related to the fec entry. If lsp_install() is
    called and no lsp entry could be created or updated, consider the
    call a failure and return -1.

    Co-developed-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>
    Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

  * zebra: add 'mpls fec nexthop-resolution' command to vtysh — Dmytro Shytyi, 2024-06-07 (7 files, -0/+117)

    Commands added:

      r3# configure
      r3(config)# mpls fec
        MPLS FEC table
        label  Label configuration
        ldp    Label Distribution Protocol
        lsp    Establish label switched path
      r3(config)# mpls fec nexthop-resolution
        Authorise nexthop resolution over all labeled routes.
      r3(config)# mpls fec nexthop-resolution

      r3# configure
      r3(config)# vrf default
      r3(config-vrf)# mpls fec
        MPLS FEC table
      r3(config-vrf)# mpls fec nexthop-resolution
        Authorise nexthop resolution over all labeled routes.
      r3(config-vrf)# mpls fec nexthop-resolution

      east-vm# show running-config
      Building configuration...
      ...
      !
      mpls fec nexthop-resolution
      !
      ...

    Signed-off-by: Dmytro Shytyi <dmytro.shytyi@6wind.com>

* zebra: Modify show `zebra dplane providers` to give more data — Donald Sharp, 2024-09-05 (3 files, -5/+25)

  The `show zebra dplane providers` command was omitting the input and
  output queues to the dplane itself. It would be nice to have this
  insight as well. New output:

    r1# show zebra dplane providers
    dataplane Incoming Queue from Zebra: 100
    Zebra dataplane providers:
      Kernel (1): in: 6, q: 0, q_max: 3, out: 6, q: 14, q_max: 3
      dplane_fpm_nl (2): in: 6, q: 10, q_max: 3, out: 6, q: 0, q_max: 3
    dataplane Outgoing Queue to Zebra: 43
    r1#

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Limit queue depth in dplane_fpm_nl — Donald Sharp, 2024-09-05 (1 file, -0/+19)

  The dplane providers have a concept of input queues and output
  queues, which are chained together during normal operation. The code
  in zebra also has a feedback mechanism where the MetaQ will not run
  when the first input queue is backed up. Having the dplane_fpm_nl
  code grab all contexts when it is backed up prevents this system from
  behaving appropriately. Modify the code to not add to the
  dplane_fpm_nl's internal queue when it is already full, allowing the
  backpressure to work appropriately in zebra proper.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

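A hedged sketch of the bounded enqueue (names like ctx_queue_len(),
ctx_queue_add_tail(), and fnc->queue_limit are assumptions; the real
dplane_fpm_nl code differs):

    /* If the provider's internal queue is full, leave the context on
     * the provider input queue; zebra's MetaQ sees the backlog and
     * stops producing until it drains. */
    if (ctx_queue_len(&fnc->ctxqueue) >= fnc->queue_limit)
        return 0;   /* try again on the next provider run */

    ctx_queue_add_tail(&fnc->ctxqueue, ctx);
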
* zebra: Modify dplane loop to allow backpressure to filter up — Donald Sharp, 2024-09-05 (1 file, -20/+94)

  Currently when dplane_thread_loop runs, it moves contexts from the
  dg_update_list onto the input queue of the first provider. This
  provider is given a chance to run, and then the items on its output
  queue are pulled off and placed on the input queue of the next
  provider. Rinse and repeat down the entire list of providers.

  Now imagine a list of multiple providers where the last provider is
  getting backed up. Contexts will end up stuck in the input queue of
  the `slow` provider, which can grow without bound. This is a real
  problem when an interface is flapping and an upper level protocol is
  sending a continuous stream of route updates to reflect the change in
  ecmp: you can end up with a very large backlog of contexts. This is
  bad because zebra can grow to a very large memory size, and on
  restricted systems you can run out of memory.

  Fortunately for us, the MetaQ already participates in this process by
  not doing more route processing until the dg_update_list goes below
  the working limit of dg_updates_per_cycle. Thus FRR modifies the
  behavior of this loop to not move more contexts onto the input queue
  if either the input queue or the output queue of the next provider
  has reached this limit. FRR then naturally starts handling
  backpressure for the dplane context system, and memory does not go
  out of control.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Use the ctx queue counters — Donald Sharp, 2024-09-05 (1 file, -25/+8)

  The ctx queue data structures already have a counter associated with
  them; let's just use it instead.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* *: Create termtable specific temp memory — Donald Sharp, 2024-09-01 (3 files, -3/+3)

  When trying to track down an MTYPE_TMP memory leak, it's harder to
  search for it when you happen to have some usage of ttable_dump.
  Let's give it its own memory type so that we can avoid confusion in
  the future.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Allow for initial deny of installation of nhe's — Donald Sharp, 2024-08-30 (5 files, -14/+71)

  Currently the FRR code will receive both kernel and connected routes
  that do not actually have an underlying nexthop group at all. Zebra
  turns around, creates a `matching` nexthop hash entry, and installs
  it. For connected routes, this creates 2 singleton nexthops in the
  dplane per interface (v4 and v6); for kernel routes it creates 1
  singleton nexthop that might or might not be used. This is bad
  because the dplane has a limited amount of space available for
  nexthop entries, and if you happen to have a large number of
  interfaces then all of a sudden you have 2x (number of interfaces)
  singleton nexthops.

  Let's modify the code to delay creation of these singleton nexthops
  until they have been used by something else in the system.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>

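Conceptually the change is a lazy-install guard; a sketch with an
assumed flag name (NEXTHOP_GROUP_INSTALL_DENIED is illustrative, not
the actual zebra flag):

    /* Skip pushing connected/kernel singleton NHGs to the dplane
     * until some route actually references them. */
    if (CHECK_FLAG(nhe->flags, NEXTHOP_GROUP_INSTALL_DENIED) &&
        nhe->refcnt == 0)
        return;   /* install later, on first real use */
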
* Merge pull request #16689 from donaldsharp/blackhole_and_afi — Jafar Al-Gharaibeh, 2024-08-30 (2 files, -7/+14)

  Blackhole and afi

  * zebra: Allow blackhole singleton nexthops to be v6 — Donald Sharp, 2024-08-29 (1 file, -6/+11)

    A blackhole nexthop, according to the linux kernel, can be v4 or
    v6. A v4 blackhole nexthop cannot be used on a v6 route, but a v6
    blackhole nexthop can be used with a v4 route. Convert all
    blackhole singleton nexthops to v6 and just use that, possibly
    reducing the number of active nexthops by 1.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

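The normalization reduces to something like this sketch
(NEXTHOP_TYPE_BLACKHOLE and AFI_IP6 are real FRR constants; the
surrounding logic is assumed):

    /* The kernel accepts a v6 blackhole nexthop for v4 routes but not
     * the reverse, so collapse every blackhole singleton to v6. */
    if (nexthop->type == NEXTHOP_TYPE_BLACKHOLE)
        afi = AFI_IP6;   /* one shared singleton instead of one per AFI */
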
  * zebra: Display afi of the nexthop hash entry — Donald Sharp, 2024-08-29 (1 file, -1/+3)

    Let's display the afi of the nexthop hash entry. Right now it is
    impossible to tell the difference between v4 and v6 nexthops, even
    though the distinction is important for the kernel.

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>

* zebra: Convince SA that the ng will always be valid — Donald Sharp, 2024-08-30 (1 file, -1/+2)

  There is a code path that could theoretically get you to a point
  where ng->nexthop is NULL. Let's make sure the static-analysis system
  believes that cannot happen anymore.

  Signed-off-by: Donald Sharp <sharpd@nvidia.com>