| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| | |
bgpd: add meta queue in bgp
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit introduces meta queue to the BGP process_queue which is
helpful in having a priority of lists where some routes can be processed
earlier than 'other' routes. This is similar to how meta queue is
present in zebra.
After Fix:
---------
For testing, note that all 100.x routes are marked as Early routes which
got enqueued and dequeued first before Other routes in every batch of
updates. Also, the items are dequeued in FIFO order.
switch# cat /var/log/frr/bgpd.log | grep sub-queue
2024/12/06 19:19:42.788014 BGP: [V64FH-G6883] 88.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:42.856127 BGP: [V64FH-G6883] 100.90.9.186/32 queued into sub-queue Early Route
2024/12/06 19:19:42.856138 BGP: [V64FH-G6883] 100.90.9.187/32 queued into sub-queue Early Route
2024/12/06 19:19:42.886715 BGP: [V64FH-G6883] 66.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.022835 BGP: [V64FH-G6883] 33.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.058842 BGP: [V64FH-G6883] 44.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.092365 BGP: [V64FH-G6883] 55.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.540770 BGP: [ZAPXS-9754G] 100.90.9.186/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.541233 BGP: [ZAPXS-9754G] 100.90.9.187/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.541523 BGP: [ZAPXS-9754G] 88.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:43.602094 BGP: [V64FH-G6883] 88.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.649083 BGP: [V64FH-G6883] 100.90.9.186/32 queued into sub-queue Early Route
2024/12/06 19:19:43.649092 BGP: [V64FH-G6883] 100.90.9.187/32 queued into sub-queue Early Route
2024/12/06 19:19:43.649148 BGP: [V64FH-G6883] 77.0.0.9/32 queued into sub-queue Other Route
2024/12/06 19:19:43.712282 BGP: [V64FH-G6883] 100.90.9.138/32 queued into sub-queue Early Route
2024/12/06 19:19:43.712314 BGP: [V64FH-G6883] 100.90.9.139/32 queued into sub-queue Early Route
2024/12/06 19:19:43.817194 BGP: [V64FH-G6883] 100.90.8.58/32 queued into sub-queue Early Route
2024/12/06 19:19:43.817205 BGP: [V64FH-G6883] 100.90.8.59/32 queued into sub-queue Early Route
2024/12/06 19:19:43.942464 BGP: [ZAPXS-9754G] 100.90.9.186/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942530 BGP: [ZAPXS-9754G] 100.90.9.187/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942550 BGP: [ZAPXS-9754G] 100.90.9.138/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942738 BGP: [ZAPXS-9754G] 100.90.9.139/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942763 BGP: [ZAPXS-9754G] 100.90.8.58/32 dequeued from sub-queue Early Route
2024/12/06 19:19:43.942788 BGP: [ZAPXS-9754G] 100.90.8.59/32 dequeued from sub-queue Early Route
2024/12/06 19:19:44.558611 BGP: [ZAPXS-9754G] 66.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:44.893541 BGP: [ZAPXS-9754G] 33.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.171794 BGP: [ZAPXS-9754G] 44.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.453137 BGP: [ZAPXS-9754G] 55.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.685269 BGP: [ZAPXS-9754G] 88.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 19:19:45.764752 BGP: [ZAPXS-9754G] 77.0.0.9/32 dequeued from sub-queue Other Route
With 'update-delay' feature (EOIU marker):
------------------------------------------
switch# vtysh -c "show run bgp" | grep update-delay
update-delay 40
switch# cat /var/log/frr/bgpd.log | grep sub-queue
2024/12/06 23:27:46.124461 BGP: [V64FH-G6883] 22.0.0.9/32 queued into sub-queue Other Route
2024/12/06 23:27:46.160224 BGP: [V64FH-G6883] 100.90.8.11/32 queued into sub-queue Early Route
2024/12/06 23:27:46.219663 BGP: [W9QTR-P4REP] EOIU Marker queued into sub-queue EOIU Marker
2024/12/06 23:27:46.269711 BGP: [ZAPXS-9754G] 100.90.8.11/32 dequeued from sub-queue Early Route
2024/12/06 23:27:46.270980 BGP: [ZAPXS-9754G] 22.0.0.9/32 dequeued from sub-queue Other Route
2024/12/06 23:27:46.404868 BGP: [RBX2V-K33CZ] EOIU Marker dequeued from sub-queue EOIU Markera
Ticket: #4200787
Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|/
|
|
|
|
| |
Remove a printfrr registration for pRN from bgpd.
Signed-off-by: Mark Stapp <mjs@cisco.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current changes deals with EVPN routes installation to zebra.
In evpn_route_select_install() we invoke evpn_zebra_install/uninstall
which sends zclient_send_message().
This is a continuation of code changes (similar to
ccfe452763d16c432fa81fd20e805bec819b345e) but to handle evpn part
of the code.
Ticket: #3390099
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BGP is now keeping a list of dests with the dest having a pointer
to the bgp_path_info that it will be working on.
1) When bgp receives a prefix, process it, add the bgp_dest of the
prefix into the new Fifo list if not present, update the flags (Ex:
earlier if the prefix was advertised and now it is a withdrawn),
increment the ref_count and DO NOT advertise the install/withdraw
to zebra yet.
2) Schedule an event to wake up to invoke the new function which will
walk the list one by one and installs/withdraws the routes into zebra.
a) if BUFFER_EMPTY, process the next item on the list
b) if BUFFER_PENDING, bail out and the callback in
zclient_flush_data() will invoke the same function when BUFFER_EMPTY
Changes
- rename old bgp_zebra_announce to bgp_zebra_announce_actual
- rename old bgp_zebra_withdrw to bgp_zebra_withdraw_actual
- Handle new fifo list cleanup in bgp_exit()
- New funcs: bgp_handle_route_announcements_to_zebra() and
bgp_zebra_route_install()
- Define a callback function to invoke
bgp_handle_route_announcements_to_zebra() when BUFFER_EMPTY in
zclient_flush_data()
The current change deals with bgp installing routes via
bgp_process_main_one()
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modify the bgp master to hold a type safe list for bgp_dests that need
to be passed to zebra.
Future commits will use this.
Ticket: #3390099
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Rajasekar Raja <rajasekarr@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
dest will not be freed due to lock but coverity does not know
that. Give it a hint. This change includes modifying bgp_dest_unlock_node
to return the dest pointer so that we can determine if we should
continue working on dest or not with an assert. Since this
is lock based we should be ok.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is based on @donaldsharp's work
The current code base is the struct bgp_node data structure.
The problem with this is that it creates a bunch of
extra data per route_node.
The table structure generates ‘holder’ nodes
that are never going to receive bgp routes,
and now the memory of those nodes is allocated
as if they are a full bgp_node.
After splitting up the bgp_node into bgp_dest and route_node,
the memory of ‘holder’ node which does not have any bgp data
will be allocated as the route_node, not the bgp_node,
and the memory usage is reduced.
The memory usage of BGP node will be reduced from 200B to 96B.
The total memory usage optimization of this part is ~16.00%.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Signed-off-by: Yuqing Zhao <xiaopanghu99@163.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the scenario of evpn, the box has some type-5 ECMP routes.
After one of its remote peers is removed ( or down ), `show evpn rmac vni all`
kept no change **sometimes**, it means the rmac of the removed peer maybe is
still in this box, and the traffic will be wrongly forwarded to the removed
peer.
The root cause is that the best path selection for type-5 routes maybe
keep no change and the best path is not routed to the removed peer, so `bgpd`
wrongly doesn't tell `zebra` to remove ( withdraw ) the type-5 routes owned
by the removed peer.
So, add a new flag to force the deletion.
Signed-off-by: anlan_cs <vic.lan@pica8.com>
|
|
|
|
|
|
|
|
|
| |
Effectively a massive search and replace of
`struct thread` to `struct event`. Using the
term `thread` gives people the thought that
this event system is a pthread when it is not
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
|
|
|
|
| |
Done with a combination of regex'ing and banging my head against a wall.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
|
|
|
|
|
|
| |
https://www.rfc-editor.org/rfc/rfc7311.html
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
|
|
|
|
|
|
| |
TL;DR: rfc7611.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bgp_node_match_ipv4
bgp_node_match_ipv6
bgp_table_iter_init
bgp_table_iter_next
bgp_table_iter_cleanup
bgp_table_iter_pause
bgp_table_iter_is_done
bgp_table_iter_started
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tested with both UDST and LTTng, both are OK.
```
[13:57:31.346131253] (+?.?????????) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "10.0.2.0/24", count = 3 }
[13:57:31.346154756] (+0.000023503) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.0.0/24", count = 3 }
[13:57:31.346156699] (+0.000001943) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.10.0/24", count = 2 }
[13:57:31.346157570] (+0.000000871) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.100.1/32", count = 2 }
[13:57:31.346158521] (+0.000000951) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.100.2/32", count = 2 }
[13:57:31.356149109] (+0.009990588) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "10.0.2.0/24", count = 3 }
[13:57:31.356155889] (+0.000006780) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.0.0/24", count = 3 }
[13:57:31.356156840] (+0.000000951) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.10.0/24", count = 2 }
[13:57:31.356157751] (+0.000000911) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.100.1/32", count = 2 }
[13:57:31.356158683] (+0.000000932) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 1 }, { prefix = "192.168.100.2/32", count = 2 }
[13:57:34.508252238] (+3.152093555) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "172.16.16.1/32", count = 2 }
[13:57:34.508289549] (+0.000037311) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "172.16.16.2/32", count = 2 }
[13:57:34.508307544] (+0.000017995) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "172.16.16.3/32", count = 2 }
[13:57:34.508433878] (+0.000126334) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "10.0.2.0/24", count = 2 }
[13:57:34.508435891] (+0.000002013) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "10.0.2.0/24", count = 2 }
[13:57:34.508458182] (+0.000022291) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.0.0/24", count = 2 }
[13:57:34.508458852] (+0.000000670) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.0.0/24", count = 2 }
[13:57:34.508472821] (+0.000013969) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.10.0/24", count = 1 }
[13:57:34.508473482] (+0.000000661) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.10.0/24", count = 1 }
[13:57:34.508487041] (+0.000013559) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.100.1/32", count = 1 }
[13:57:34.508487792] (+0.000000751) home-spine1.donatas.net frr_bgp:bgp_dest_lock: { cpu_id = 2 }, { prefix = "192.168.100.1/32", count = 1 }
```
Converting bgp_dest_lock_node/bgp_dest_unlock_node to non-inlined function
because LTTng can't work properly with inlined and the compiler does not like
it.
Not sure how it would be with the performance, but let's see.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
|
|\
| |
| | |
bgpd: split soft reconfig table task into several jobs to not block vtysh
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
BGP configuration changes that imply recomputing the BGP route table
(e.g. modifying route-maps, setting bgp graceful-shutdown) might be a
long time process depending on the size of the BGP table and the
route-map numbers and complexity. For example, setups with full
Internet routes take something like one minute to reprocess all the
prefixes when graceful-shutdown is configured. During this time, a
"show bgp commands" request on vtysh results in blocking the shell until
the soft reconfigure table task is over.
This patch splits bgp_soft_reconfig_table task into thread jobs of 25K
prefixes.
Some tests on a full Internet route setup show that after reconfiguring
route-maps or graceful-shutdown, vtysh is not stucked anymore. We are
now able to request commands like "show bgp summary" after 1 or 2
seconds instead of 30 to 60s.
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
static inline functions cannot be probed, let's add USDT.
This will help catching memory leaks regarding lock/unlock in the future,
I believe.
```
global locks;
probe begin
{
ansi_clear_screen();
println("Starting...");
}
probe process("/usr/lib/frr/bgpd").mark("bgp_dest_lock")
{
prefix = @cast($arg1, "bgp_node")->p->u->val;
locks[prefix]++;
printf("---> %s %d (lock count: %d)\n", probefunc(),
prefix, @cast($arg1, "bgp_node")->lock);
}
probe process("/usr/lib/frr/bgpd").mark("bgp_dest_unlock")
{
prefix = @cast($arg1, "bgp_node")->p->u->val;
if (locks[prefix] > 0) {
locks[prefix]--;
}
printf("<--- %s %d (lock count: %d)\n", probefunc(),
prefix, @cast($arg1, "bgp_node")->lock);
}
```
Some outputs:
```
<--- adj_free 94283113939304 (lock count: 2)
<--- adj_free 94283114099240 (lock count: 3)
<--- adj_free 94283114099752 (lock count: 3)
---> bgp_clear_route_table 94283113936008 (lock count: 4)
---> bgp_clear_route_table 94283113932664 (lock count: 4)
---> bgp_clear_route_table 94283114099240 (lock count: 3)
---> bgp_clear_route_table 94283114099752 (lock count: 3)
<--- adj_free 94283113795608 (lock count: 2)
<--- adj_free 94283113797112 (lock count: 2)
<--- adj_free 94283113796456 (lock count: 2)
---> bgp_clear_route_table 94283113936008 (lock count: 5)
---> bgp_clear_route_table 94283113932664 (lock count: 5)
---> bgp_clear_route_table 94283114099240 (lock count: 4)
---> bgp_clear_route_table 94283114099752 (lock count: 4)
---> bgp_process 94283113936008 (lock count: 5)
<--- bgp_clear_node_queue_del 94283113936008 (lock count: 6)
---> bgp_process 94283113932664 (lock count: 5)
<--- bgp_clear_node_queue_del 94283113932664 (lock count: 6)
---> bgp_process 94283114099240 (lock count: 4)
<--- bgp_clear_node_queue_del 94283114099240 (lock count: 5)
---> bgp_process 94283114099752 (lock count: 4)
<--- bgp_clear_node_queue_del 94283114099752 (lock count: 5)
<--- bgp_clear_node_queue_del 94283113936008 (lock count: 5)
<--- bgp_clear_node_queue_del 94283113932664 (lock count: 5)
<--- bgp_clear_node_queue_del 94283114099240 (lock count: 4)
<--- bgp_clear_node_queue_del 94283114099752 (lock count: 4)
<--- bgp_path_info_reap 94283113936008 (lock count: 4)
<--- bgp_path_info_reap 94283113936008 (lock count: 3)
```
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To prepare for fixing an issue where labels do not get released back
to the labelpool when the route is deleted some refactoring is
necessary. There are 2 parts to this.
1. restructure the code to remove the circular nature of label
allocations via the labelpool and decouple the label type decision
from the notification fo the FEC.
The code to notify the FEC association to zebra has been split out
into a separate function so that it can be called from the synchronous
path (for registration of index-based labels and de-registration of all
labels), and from the asynchronous path where we need to wait for a
callback from the labelpool code with a label allocation.
The decision about whether we are using an index-based label or an
allocated label is reflected in the state of the BGP_NODE_LABEL_REQUESTED
flag so the checks on the path_info in the labelpool callback code are
no longer required.
2. change the owned of a labelpool allocated label from the path info
structure to the bgp_dest structure. This allows labels to be released
(in a subsequent commit) when the owner (bgp_dest) goes away.
Signed-off-by: Pat Ruddy <pat@voltanet.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Process FIB update in bgp_zebra_route_notify_owner() and call
group_announce_route() if route is installed
* When bgp update is received for a route which is not installed earlier
(flag BGP_NODE_FIB_INSTALLED is not set) and suppress fib is enabled
set the flag BGP_NODE_FIB_INSTALL_PENDING to indicate fib install is
pending for the route. The route will be advertised when zebra send
ZAPI_ROUTE_INSTALLED status.
* The advertisement delay (BGP_DEFAULT_UPDATE_ADVERTISEMENT_TIME)
is added to allow more routes to be sent in single update message.
This is required since zebra sends route notify message for each route.
The delay will be applied to update group timer which advertises
routes to peers.
Signed-off-by: kssoman <somanks@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
The `struct listnode *rt_node` data structure is adding
8 bytes of size to the `struct bgp_dest`. This is a large
amount of data for a flag we are already setting on each
node for this. Just set the flag and use that to figure
out who we are doing graceful restart on.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
|
|
|
|
|
|
| |
Create appropriate accessor functions for the rn->lock
data. We should be accessing this data through accessor
functions since it is private data to the data structure.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
|
|
|
|
|
|
| |
`%pRN` is not appropriate anymore.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a SYNC route i.e. a route with a local ES as destination is
rxed on a switch (say L11) from an ES peer (say L12) a local
MAC/neigh entry is created on L11 with the local access port
as dest port.
Creation of the local entry triggers a local path advertisement from
L11. This could be a "locally-active" path or a "locally-inactive"
path. Inactive paths are advertised with the proxy bit.
To ensure that the local entry is not deleted by a SYNC route it is
given absolute precedence over peer-paths.
If there are two non-local paths with the same dest ES and same MM
seq number the non-proxy path is preferred. This is done to ensure
that we don't lose track of the peer-activity.
Signed-off-by: Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
This is the bulk part extracted from "bgpd: Convert from `struct
bgp_node` to `struct bgp_dest`". It should not result in any functional
change.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
|
|
|
|
|
|
|
| |
Announcements that are marked as invalid were previously not revalidated.
This was fixed by replacing the range lookup with a subtree lookup.
Signed-off-by: Marcel Röthke <marcel.roethke@haw-hamburg.de>
|
|
|
|
| |
Signed-off-by: David Lamparter <equinox@diac24.net>
|
|
|
|
|
|
|
|
|
| |
Add new function `bgp_node_get_prefix()` and modify
the bgp code base to use it.
This is prep work for the struct bgp_dest rework.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
Tell the compiler that the prefix is being used for lookups
and it will never change.
Setup for future work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
|
| |
* Selection Deferral Timer for Graceful Restart.
* Added selection deferral timer handling function.
* Route marking as selection defer when update message is received.
* Staggered processing of routes which are pending best selection.
* Fix for multi-path test case.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com>
|
|
|
|
|
|
|
|
|
| |
Store in bgp_node the reason why we choose a particular
best path over another. At this point we do not do
anything other than just store this data when we make
the decision. Future commits will display it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|\
| |
| | |
Bgp node continued
|
| |
| |
| |
| |
| |
| |
| |
| | |
The bgp_connected_set_node_info and bgp_connected_get_node_info
function names were slightly backwards lets fix them up
to bgp_node_set_bgp_connected_ref_info and bgp_node_get_bgp_connected_ref_info
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The bgp_distance_set_node_info and bgp_distance_get_node_info
function names were slightly backwards lets fix them up
to bgp_node_get_bgp_distance_info and bgp_node_set_bgp_distance_info
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The bgp_static_set_node_info and bgp_static_get_node_info
function names were slightly backwards rename to
bgp_node_get_bgp_static_info and bgp_node_set_bgp_static_info
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The bgp_aggregate_set_node_info and bgp_aggregate_get_node_info
functions names were slightly backwards, rename to
bgp_node_get_bgp_aggregate_info and bgp_node_set_bgp_aggregate_info
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| | |
The bgp_nexthop_set_node_info and bgp_nexthop_get_node_info
function names were slightly backwards, rename to bgp_node_set and get
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| | |
Convert the set/get of bgp_table's from the info pointer.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The bgp_info data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ordering of data within the `struct bgp_node`
was causing extra padding of data. Moving the version
to a bit different spot allows for more efficient packing
of data.
Pre-change:
(gdb) p sizeof(struct bgp_node)
$1 = 152
(gdb)
Post-change:
(gdb) p sizeof(struct bgp_node)
$1 = 144
(gdb)
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|/
|
|
|
|
|
|
|
| |
The adj_out data structure is a linked list of adjacencies
1 per update group. In a large scale env where we are
not using peer groups, this list lookup starts to become
rather costly. Convert to a better data structure for this.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The motivation for this patch is to address a concerning behavior of
tx-addpath-bestpath-per-AS. Prior to this patch, all paths' TX ID was
pre-determined as the path was received from a peer. However, this meant
that any time the path selected as best from an AS changed, bgpd had no
choice but to withdraw the previous best path, and advertise the new
best-path under a new TX ID. This could cause significant network
disruption, especially for the subset of prefixes coming from only one
AS that were also communicated over a bestpath-per-AS session.
The patch's general approach is best illustrated by
txaddpath_update_ids. After a bestpath run (required for best-per-AS to
know what will and will not be sent as addpaths) ID numbers will be
stripped from paths that no longer need to be sent, and held in a pool.
Then, paths that will be sent as addpaths and do not already have ID
numbers will allocate new ID numbers, pulling first from that pool.
Finally, anything left in the pool will be returned to the allocator.
In order for this to work, ID numbers had to be split by strategy. The
tx-addpath-All strategy would keep every ID number "in use" constantly,
preventing IDs from being transferred to different paths. Rather than
create two variables for ID, this patch create a more generic array that
will easily enable more addpath strategies to be implemented. The
previously described ID manipulations will happen per addpath strategy,
and will only be run for strategies that are enabled on at least one
peer.
Finally, the ID numbers are allocated from an allocator that tracks per
AFI/SAFI/Addpath Strategy which IDs are in use. Though it would be very
improbable, there was the possibility with the free-running counter
approach for rollover to cause two paths on the same prefix to get
assigned the same TX ID. As remote as the possibility is, we prefer to
not leave it to chance.
This ID re-use method is not perfect. In some cases you could still get
withdraw-then-add behaviors where not strictly necessary. In the case of
bestpath-per-AS this requires one AS to advertise a prefix for the first
time, then a second AS withdraws that prefix, all within the space of an
already pending MRAI timer. In those situations a withdraw-then-add is
more forgivable, and fixing it would probably require a much more
significant effort, as IDs would need to be moved to ADVs instead of
paths.
Signed-off-by Mitchell Skiba <mskiba@amazon.com>
|
|
|
|
|
|
|
| |
Wrapper the get/set of the table->info pointer so that
people are not directly accessing this data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
The bgp_nexthop_cache data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
The bgp_connected_ref data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
The bgp_static data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
The bgp_distance data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
|
| |
The aggregate data is stored as a void pointer in `struct bgp_node`.
Abstract retrieval of this data and setting of this data
into functions so that in the future we can move around
what is stored in bgp_node.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|
|
|
|
|
|
|
|
| |
Make the wart slightly less bad... also there is still a possible write
after free here. This needs to be fixed again, properly, by some
structure changes.
Signed-off-by: David Lamparter <equinox@diac24.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bi->net pointer that is being unlocked had a commit
that removed the `bi->net = NULL;` recently. This code
was preventing a use after free crash being experienced
in other code paths. While commit 37e679629f9 was fixing
a different code path crash.
Make the parent->net pointer aware it may be locked/freed
from multiple places and to not NULL the pointer to it
unless we have actually freed the data.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
|