| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
Wrong: memset(&a, 0, sizeof(struct ...));
Good: memset(&a, 0, sizeof(a));
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
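A minimal sketch of the preferred form, assuming a hypothetical `struct peer_stats` that is not part of FRR. Taking `sizeof` of the object keeps the size correct even if the variable's declared type changes later:

```c
#include <string.h>

/* Hypothetical struct, used only for illustration. */
struct peer_stats {
	unsigned long rx;
	unsigned long tx;
};

/* Zero a stack object using sizeof on the object itself, so the size
 * passed to memset() always follows the declaration of `st`. */
unsigned long zeroed_sum(void)
{
	struct peer_stats st;

	st.rx = 7;
	st.tx = 9;
	memset(&st, 0, sizeof(st)); /* preferred: sizeof(object) */

	return st.rx + st.tx;
}
```

With `sizeof(struct ...)` the call would still compile if `st` were later changed to a different type, and could silently clear too few or too many bytes.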
| |
To allow people to know the state of watchfrr from vtysh,
let's add a bit more data to the output.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
When watchfrr has noticed issues, send operational state
to systemd so operators issuing `systemctl status frr` can
see a more nuanced state of the daemon.
Add the `--operational-timeout X` value to the cli. After
the daemon has been restarted and communication re-established,
wait this time before reporting to systemd that the daemon
is up and running.
Default value of 60 seconds was chosen to allow some small
delay in reporting so that, if the daemon is in a crash loop,
status will not ping-pong.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
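The hold-down described above could be sketched as follows. The function name and the constant are hypothetical and only illustrate the idea of waiting out the timeout before reporting "up"; this is not the watchfrr implementation:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical constant mirroring the 60 second default named above. */
#define OPERATIONAL_TIMEOUT_DEFAULT 60

/* Return true once the daemon has stayed reachable for at least
 * `timeout` seconds since communication was re-established. */
bool may_report_operational(time_t restored_at, time_t now, long timeout)
{
	return difftime(now, restored_at) >= (double)timeout;
}
```

A daemon stuck in a crash loop keeps resetting `restored_at`, so the check never passes and the reported status stays degraded instead of flapping.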
| |
Align watchfrr with our coding standard
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
This will align with our coding standards.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
The int return value is never used. Modify the code
base to just return void instead.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
Naming functions/data structures more appropriately for
the project we are actually in.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
Since watchfrr invokes vtysh to gather the show run output and
write the data, if we are operating inside of a namespace FRR
must also pass this in.
Yes. This seems hacky. I don't fully understand why vtysh
is invoked this way.
New output:
sharpd@eva:~/frr3$ sudo vtysh -N one
Hello, this is FRRouting (version 8.1-dev).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
eva# wr mem
Note: this version of vtysh never writes vtysh.conf
% Can't open configuration file /etc/frr/one/vtysh.conf due to 'No such file or directory'.
Building Configuration...
Integrated configuration saved to /etc/frr/one/frr.conf
[OK]
eva#
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
FRR should only ever use the appropriate THREAD_ON/THREAD_OFF
semantics. This is especially true for the functions we
end up calling the thread for.
Signed-off-by: Donatas Abraitis <donatas.abraitis@gmail.com>
| |
This replaces the external libsystemd dependency with... pretty much the
same amount of built-in code. But with one fewer dependency and build
switch needed.
Also check `JOURNAL_STREAM` for future logging integration.
Signed-off-by: David Lamparter <equinox@diac24.net>
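For context, the systemd notification protocol is small enough to speak directly: a datagram containing the state string is sent to the AF_UNIX socket named by `$NOTIFY_SOCKET`. The sketch below shows that general protocol only. The function name `notify_send` is hypothetical, and this is not the code added by this commit:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Send one state string (e.g. "READY=1" or "STATUS=...") to the socket
 * named by $NOTIFY_SOCKET. Returns 0 if the variable is unset (nothing
 * sent), 1 on success, -1 on error. */
int notify_send(const char *state)
{
	const char *path = getenv("NOTIFY_SOCKET");
	struct sockaddr_un addr;
	socklen_t len;
	ssize_t rv;
	int fd;

	if (!path || !*path)
		return 0;
	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path, path, strlen(path));
	if (addr.sun_path[0] == '@') {
		/* Leading '@' means a Linux abstract socket: the first byte
		 * becomes NUL and the length covers only the name itself. */
		addr.sun_path[0] = '\0';
		len = offsetof(struct sockaddr_un, sun_path) + strlen(path);
	} else {
		len = sizeof(addr);
	}

	fd = socket(AF_UNIX, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	rv = sendto(fd, state, strlen(state), 0, (struct sockaddr *)&addr, len);
	close(fd);

	return rv < 0 ? -1 : 1;
}
```

A caller would use something like `notify_send("READY=1")` after startup, or `notify_send("STATUS=...")` to update the status line. When the process is not run under systemd, the variable is unset and the call is a no-op.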
| |
Compile with v2.0.0 tag of `libyang2` branch of:
https://github.com/CESNET/libyang
staticd init load time of 10k routes now 6s vs ly1 time of 150s
Signed-off-by: Christian Hopps <chopps@labn.net>
| |
... again ...
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
Back when I put this together in 2015, ISO C11 was still reasonably new
and we couldn't require it just yet. Without ISO C11, there is no
"good" way (only bad hacks) to require a semicolon after a macro that
ends with a function definition. And if you added one anyway, you'd get
"spurious semicolon" warnings on some compilers...
With C11, `_Static_assert()` at the end of a macro will make it so that
the semicolon is properly required, consumed, and not warned about.
Consistently requiring semicolons after "file-level" macros matches
Linux kernel coding style and helps some editors against mis-syntax'ing
these macros.
Signed-off-by: David Lamparter <equinox@diac24.net>
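A small illustration of the technique, using a hypothetical `DEFINE_GETTER` macro rather than any of FRR's real macros. The macro body ends in a function definition followed by `_Static_assert(...)` without a semicolon, so the user of the macro has to supply one:

```c
/* Hypothetical macro, for illustration only. It defines a function and
 * then ends with a _Static_assert that is always true; the assert is a
 * declaration, so it requires and consumes the trailing semicolon. */
#define DEFINE_GETTER(name, value)                                  \
	static int name(void)                                       \
	{                                                           \
		return (value);                                     \
	}                                                           \
	_Static_assert(1, "semicolon required after DEFINE_GETTER")

DEFINE_GETTER(get_answer, 42);
DEFINE_GETTER(get_zero, 0);
```

Omitting the semicolon after either use is a syntax error, and writing it produces no "extra semicolon" diagnostic because it terminates a real declaration.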
| |
We do not need to display `Warning: ...` in a zlog_warn
message.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
`valid_command` now causes static analyzer complaints since it no
longer assumes `optarg` is non-NULL. If this was the case then
`valid_command` would return `false` (or 0) because it would mean the
string is empty and doesn't contain the '%s' it expects.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
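The NULL handling described here can be sketched as below. The helper name `command_has_placeholder` is hypothetical, and the real `valid_command` may apply stricter checks than a plain search for `%s`:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true only if `cmd` is non-NULL and contains a "%s" placeholder.
 * A NULL or empty string is rejected without being dereferenced. */
bool command_has_placeholder(const char *cmd)
{
	if (cmd == NULL)
		return false;

	return strstr(cmd, "%s") != NULL;
}
```

Checking for NULL first gives the analyzer (and the reader) an explicit guarantee instead of relying on the option parser always providing an argument.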
| |
Fix `netns` command line handling for missing argument (it's optional).
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
| |
Replace all lib/thread cancel macros, use thread_cancel()
everywhere. Only the THREAD_OFF macro and thread_cancel() api are
supported. Also adjust thread_cancel_async() to NULL caller's pointer (if
present).
Signed-off-by: Mark Stapp <mjs@voltanet.io>
| |
Change thread_cancel to take a ** to an event, NULL-check
before dereferencing, and NULL the caller's pointer. Update
many callers to use the new signature.
Signed-off-by: Mark Stapp <mjs@voltanet.io>
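The shape of such an API can be sketched with a stand-in type. `struct event_sketch` and `event_cancel_sketch()` are hypothetical names; the real `thread_cancel()` also has to remove the event from the scheduler, which is omitted here:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scheduled event; not FRR's real struct. */
struct event_sketch {
	int id;
};

/* Cancel through a pointer-to-pointer: tolerate NULL at both levels,
 * release the event, and clear the caller's pointer so that it cannot
 * dangle afterwards. */
void event_cancel_sketch(struct event_sketch **ev)
{
	if (ev == NULL || *ev == NULL)
		return;

	free(*ev);
	*ev = NULL;
}
```

Because the callee performs the NULL check and resets the pointer, callers no longer need their own `if (ptr)` guard, and cancelling twice is harmless.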
| |
We have this pattern in the code base:
if (thread)
THREAD_OFF(thread);
If we look at THREAD_OFF we check to see if thread
is non-null too. So we have a double check.
This is unnecessary. Convert to just using THREAD_OFF.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
| |
This adds -N and --netns options to watchfrr, allowing it to start
daemons with -N and switching network namespaces respectively.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| |
Remove mid-string line breaks, cf. workflow doc:
.. [#tool_style_conflicts] For example, lines over 80 characters are allowed
for text strings to make it possible to search the code for them: please
see `Linux kernel style (breaking long lines and strings)
<https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_
and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.
Scripted commit, idempotent to running:
```
python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
```
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
These are easy to get subtly wrong, and doing so can cause
nondeterministic failures when racing in parallel builds.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| |
No need to put $(top_srcdir) everywhere.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| |
Replace all `random()` calls with a function called `frr_weak_random()`
and make it clear that it is only supposed to be used for weak random
applications.
Use the annotation described by the Coverity Scan documentation to
ignore `random()` call warnings.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
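A wrapper of this kind might look roughly like the following. The name `weak_random_sketch` is hypothetical, and the exact annotation text is an assumption based on Coverity's inline suppression comment syntax; check the project headers for the real helper:

```c
#include <stdlib.h>

/* Thin wrapper that makes the intent explicit at every call site: the
 * value is fine for jitter or timers, never for keys or nonces. */
static inline long weak_random_sketch(void)
{
	/* coverity[dont_call] */
	return random();
}
```

Funnelling every call through one function means the analyzer warning is suppressed in exactly one place, and any future use of plain `random()` elsewhere stands out in review.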
| |
This is a full rewrite of the "back end" logging code. It now uses a
lock-free list to iterate over logging targets, and the targets
themselves are as lock-free as possible. (syslog() may have a hidden
internal mutex in the C library; the file/fd targets use a single
write() call which should ensure atomicity kernel-side.)
Note that some functionality is lost in this patch:
- Solaris printstack() backtraces are ditched (unlikely to come back)
- the `log-filter` machinery is gone (re-added in followup commit)
- `terminal monitor` is temporarily stubbed out. The old code had a
race condition with VTYs going away. It'll likely come back rewritten
and with vtysh support.
- The `zebra_ext_log` hook is gone. Instead, it's now much easier to
add a "proper" logging target.
v2: TLS buffer to get some actual performance
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
When watchfrr starts up, it first tries to connect to daemons. This is
expected to fail if we are just starting up FRR, but we log it as an
error, and it shows up red in journalctl. Similarly, when we fork
background commands, that is also logged as an error. This is scaring
users; let's change these to info.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
| |
The vrrpd one conflicts with the standalone vrrpd package; also we're
installing daemons to /usr/lib/frr on some systems so they're not on
PATH.
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
And memory_init() to lib_cmd_init().
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
const const const your boat, merrily down the stream...
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
Allow systemd to be informed about operational state so operators can
infer a bit about what is going on with FRR from the systemd status
cli.
sharpd@robot ~/frr4> systemctl status frr
● frr.service - FRRouting
Loaded: loaded (/usr/lib/systemd/system/frr.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2019-10-03 21:09:04 EDT; 7s ago
Docs: https://frrouting.readthedocs.io/en/latest/setup.html
Process: 32455 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
Status: "FRR Operational"
Tasks: 12 (limit: 4915)
Memory: 76.5M
CGroup: /system.slice/frr.service
├─32468 /usr/lib/frr/watchfrr -d zebra bgpd staticd
├─32487 /usr/lib/frr/zebra -d -A 127.0.0.1 -s 90000000
├─32492 /usr/lib/frr/bgpd -d -A 127.0.0.1
└─32500 /usr/lib/frr/staticd -d -A 127.0.0.1
Please note the `Status: ...` line above.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
| |
There is a fairly common state we are seeing where watchfrr
has decided that something is not right and is printing out
a `wtf` message. At this point I am not sure what is going on
or how we are getting here, but let's add a bit more data dump
to the message so that we can figure out what is going on.
This is mainly being done because at this point in time I have no
clue how we got here and I cannot reproduce it.
Maybe by adding more useful information here I can figure out what is
going on.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
| |
Allow an end user who is debugging behavior, with say gdb, to turn
off watchfrr and its attempts to keep control of a daemon's up/responsiveness.
With code change:
donna.cumulusnetworks.com# show watchfrr
watchfrr global phase: Idle
zebra Up
bgpd Up/Ignoring Timeout
staticd Up
Now grab bgpd with gdb:
sharpd@donna ~/frr4> date ; sudo gdb -p 27893
Mon 16 Sep 2019 01:44:57 PM EDT
GNU gdb (GDB) Fedora 8.3-6.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 27893
[New LWP 27894]
[New LWP 27895]
[New LWP 27896]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f1787a3e5c7 in poll () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-15.fc30.x86_64 gperftools-libs-2.7-5.fc30.x86_64 json-c-0.13.1-4.fc30.x86_64 libcap-2.26-5.fc30.x86_64 libgcc-9.1.1-1.fc30.x86_64 libgcrypt-1.8.4-3.fc30.x86_64 libgpg-error-1.33-2.fc30.x86_64 libstdc++-9.1.1-1.fc30.x86_64 libxcrypt-4.4.6-2.fc30.x86_64 libyang-0.16.105-1.fc30.x86_64 lua-libs-5.3.5-5.fc30.x86_64 lz4-libs-1.8.3-2.fc30.x86_64 pcre-8.43-2.fc30.x86_64 xz-libs-5.2.4-5.fc30.x86_64
(gdb)
In another window we can see when watchfrr thinks it's not responding:
donna.cumulusnetworks.com# show watchfrr
watchfrr global phase: Idle
zebra Up
bgpd Unresponsive/Ignoring Timeout
staticd Up
Finally exit gdb and watchfrr now believes bgpd is good to go again:
donna.cumulusnetworks.com# show watchfrr
watchfrr global phase: Idle
zebra Up
bgpd Up/Ignoring Timeout
staticd Up
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
| |
Some types like `time_t` vary across platforms and always need to be
cast when printed.
Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
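As an example of the pattern, a `time_t` can be cast to a fixed, wide type before it reaches a printf-style function. The helper name `format_uptime` is hypothetical:

```c
#include <stdio.h>
#include <time.h>

/* Format a duration held in a time_t. The cast to long long matches the
 * %lld conversion on every platform, whatever width time_t has there. */
int format_uptime(char *buf, size_t len, time_t secs)
{
	return snprintf(buf, len, "uptime %lld s", (long long)secs);
}
```

Passing the `time_t` directly with `%ld` would work where `time_t` is `long` but is undefined behavior where it is a different width.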
| |
When the user specifies -N namespace allow it to influence the
frr_vtydir(DAEMON_VTY_DIR) to have namespace in its path
like so: $frrstate_dir/<namespace>
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
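The path construction amounts to appending the namespace name as one extra component. A minimal sketch, where `build_vtydir` is a hypothetical helper and the state directory is whatever the build was configured with:

```c
#include <stdio.h>

/* Write "<state_dir>/<netns>" into `out` when a namespace is given,
 * otherwise just "<state_dir>". Returns 0 on success, -1 if the result
 * did not fit into `len` bytes. */
int build_vtydir(char *out, size_t len, const char *state_dir,
		 const char *netns)
{
	int n;

	if (netns && *netns)
		n = snprintf(out, len, "%s/%s", state_dir, netns);
	else
		n = snprintf(out, len, "%s", state_dir);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping the per-namespace sockets in separate directories lets several FRR instances run side by side without their VTY sockets colliding.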
| |
In a variety of places we are using DAEMON_VTY_DIR, convert
to use frr_vtydir. This will allow us in a future commit
to have the -N namespace option be automatically used.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
| |
There's no good reason to not have these options default to the
installation path of tools/watchfrr.sh. Doing so allows us to ditch
watchfrr_options from daemons/daemons.conf completely.
Fixes: #3652
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
If we wait forever for all daemons to come up, we can hang the entire
boot process, especially on init.d based systems.
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
- some target_CFLAGS that needed to include AM_CFLAGS didn't do so
- libyang/sysrepo/sqlite3/confd CFLAGS + LIBS weren't used at all
- consistently use $(FOO_CFLAGS) instead of @FOO_CFLAGS@
- 2 dependencies were missing for clippy
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
Just to see WTF is going on inside watchfrr...
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
When we make a call to (re)start some daemon(s), we can immediately try
connecting to its VTY socket after the script completes. If the daemon
started correctly, this will always succeed since the start script only
returns after daemon startup is complete.
Among other things, this reduces the delay to "startup complete"
notification at initial watchfrr start.
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
We were waiting for timers to expire even when we already know the
status of all daemons. This delays startup for no good reason.
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
This cleans up watchfrr to be more "normal" like the other daemons in
terms of what it does in main(), i.e. using the full frr_*() call set.
Also, this changes the startup behaviour on watchfrr to stay attached on
the daemon's parent process until startup is really complete. This
should allow removing the "watchfrr.started" hack at some point.
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
The stderr output is not being displayed as part of watchfrr invocation
in system startup, specifically when the user has not properly specified
1 or more daemons to monitor. If the end-user is using tools/frr
this stderr is dropped (and systemd appears to drop stderr too?).
Modify the two stderr calls in this situation to use the zlog system.
Now I can clearly see an error message that tells me what has gone wrong.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
[DL: fixed typo]
| |
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
| |
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
| |
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
| |
Can't build manpages without sphinx-build, oops...
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
Signed-off-by: David Lamparter <equinox@diac24.net>
| |
Since we're now building through one large Makefile, we can easily put
things with their daemons and crossreference nicely.
Signed-off-by: David Lamparter <equinox@diac24.net>