path: root/watchfrr/watchfrr.c
Commit message | Author | Age | Files | Lines
* lib, watchfrr: remove `HAVE_SYSTEMD`, use own code | David Lamparter | 2021-06-29 | 1 | -7/+3
    This replaces the external libsystemd dependency with... pretty much the same amount of built-in code. But with one fewer dependency and build switch needed.

    Also check `JOURNAL_STREAM` for future logging integration.

    Signed-off-by: David Lamparter <equinox@diac24.net>
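The commit message above does not show how `JOURNAL_STREAM` is checked. As a rough sketch, assuming systemd's documented format for that variable (device and inode numbers of the journal-connected descriptor, in decimal, separated by a colon), a check could parse the value and compare it with the `st_dev`/`st_ino` pair that `fstat()` reports for stderr. The name `demo_journal_stream_matches` is hypothetical and this is not FRR's code:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper, not FRR code: returns true when `env` (the value of
 * JOURNAL_STREAM, expected as "<dev>:<ino>" in decimal) names the given
 * device/inode pair, e.g. the one fstat() reports for stderr. */
bool demo_journal_stream_matches(const char *env, unsigned long long dev,
                                 unsigned long long ino)
{
    unsigned long long env_dev, env_ino;
    char trailing;

    if (env == NULL)
        return false; /* variable unset: output is not going to the journal */

    /* Exactly two fields must convert; a third conversion means junk followed. */
    if (sscanf(env, "%llu:%llu%c", &env_dev, &env_ino, &trailing) != 2)
        return false;

    return env_dev == dev && env_ino == ino;
}
```

A caller would typically pass `getenv("JOURNAL_STREAM")` together with the values from `fstat(STDERR_FILENO, &st)`.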
* *: require semicolon after FRR_DAEMON_INFO & co. | David Lamparter | 2021-03-17 | 1 | -1/+2
    ... again ...

    Signed-off-by: David Lamparter <equinox@diac24.net>
* *: require semicolon after DEFINE_MTYPE & co | David Lamparter | 2021-03-17 | 1 | -2/+2
    Back when I put this together in 2015, ISO C11 was still reasonably new and we couldn't require it just yet. Without ISO C11, there is no "good" way (only bad hacks) to require a semicolon after a macro that ends with a function definition. And if you added one anyway, you'd get "spurious semicolon" warnings on some compilers...

    With C11, `_Static_assert()` at the end of a macro will make it so that the semicolon is properly required, consumed, and not warned about. Consistently requiring semicolons after "file-level" macros matches Linux kernel coding style and helps some editors against mis-syntax'ing these macros.

    Signed-off-by: David Lamparter <equinox@diac24.net>
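The `_Static_assert()` technique described in the commit above can be illustrated with a small stand-alone macro. `DEFINE_COUNTER` and the names it generates are hypothetical; this is a minimal sketch of the idea, not FRR's `DEFINE_MTYPE`:

```c
#include <stddef.h>

/* Hypothetical "file-level" macro. Its expansion ends with a function
 * definition, so without the final _Static_assert there would be nothing
 * left for a trailing semicolon to terminate. The always-true assertion is
 * a declaration that needs a ';', which the macro's user must supply. */
#define DEFINE_COUNTER(name)                                        \
    static size_t name##_value;                                     \
    size_t name##_bump(void)                                        \
    {                                                               \
        return ++name##_value;                                      \
    }                                                               \
    _Static_assert(1, "semicolon required after DEFINE_COUNTER")

/* The semicolon here is consumed by the _Static_assert; omitting it is a
 * compile error rather than a silent success. */
DEFINE_COUNTER(restarts);
```

Because the semicolon terminates a real declaration, compilers have no reason to warn about a stray empty declaration at file scope.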
* watchfrr: Convert to not use warning in warning messages | Donald Sharp | 2021-03-10 | 1 | -3/+3
    We do not need to display: `Warning: ...` in a zlog_warn message

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* watchfrr: fix SA warning | Rafael Zalamena | 2021-01-26 | 1 | -0/+3
    `valid_command` now causes static analyzer complaints since it no longer assumes `optarg` is non-NULL. If this was the case then `valid_command` would return `false` (or 0) because it would mean the string is empty and doesn't contain the '%s' it expects.

    Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
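The behaviour described in the commit above (reject a missing argument, require the `%s` placeholder) could look roughly like the following. `demo_valid_command` is a hypothetical re-creation, and the "exactly one `%`" rule is an assumption made for the sketch rather than something the commit message states:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: a command template is accepted only if it is
 * non-NULL and contains exactly one '%', which must start "%s". */
bool demo_valid_command(const char *cmd)
{
    const char *p;

    if (cmd == NULL)
        return false; /* missing option argument: reject explicitly */

    p = strchr(cmd, '%');
    return p != NULL && p[1] == 's' && strchr(p + 1, '%') == NULL;
}
```

Handling `NULL` up front gives the static analyzer a path on which the pointer is never dereferenced.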
* watchfrr: fix crash on missing optional argument | Rafael Zalamena | 2021-01-25 | 1 | -1/+1
    Fix `netns` command line handling for missing argument (it's optional).

    Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
* *: unify thread/event cancel macros | Mark Stapp | 2020-10-23 | 1 | -0/+1
    Replace all lib/thread cancel macros, use thread_cancel() everywhere. Only the THREAD_OFF macro and thread_cancel() api are supported. Also adjust thread_cancel_async() to NULL caller's pointer (if present).

    Signed-off-by: Mark Stapp <mjs@voltanet.io>
* * : update signature of thread_cancel api | Mark Stapp | 2020-10-23 | 1 | -7/+5
    Change thread_cancel to take a ** to an event, NULL-check before dereferencing, and NULL the caller's pointer. Update many callers to use the new signature.

    Signed-off-by: Mark Stapp <mjs@voltanet.io>
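The signature change described in the commit above follows a common C pattern: pass the address of the caller's pointer so the callee can both NULL-check it and reset it. A minimal sketch with hypothetical names (`demo_event`, `demo_cancel`), not FRR's actual `thread_cancel()`:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified event record; not FRR's struct thread. */
struct demo_event {
    bool cancelled;
};

/* Takes the address of the caller's pointer so it can be reset to NULL. */
void demo_cancel(struct demo_event **evp)
{
    if (evp == NULL || *evp == NULL)
        return; /* nothing scheduled; callers need no check of their own */

    (*evp)->cancelled = true;
    *evp = NULL; /* the caller cannot keep using a stale pointer */
}
```

With this shape a call site needs neither an `if (ptr)` guard before the call nor a manual `ptr = NULL` after it.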
* *: Use proper semantics for turning off thread | Donald Sharp | 2020-10-12 | 1 | -2/+1
    We have this pattern in the code base:

        if (thread)
            THREAD_OFF(thread);

    If we look at THREAD_OFF we check to see if thread is non-null too. So we have a double check. This is unnecessary. Convert to just using THREAD_OFF

    Signed-off-by: Donald Sharp <sharpd@nvidia.com>
* watchfrr: add (network) namespace support | David Lamparter | 2020-07-22 | 1 | -3/+178
    This adds -N and --netns options to watchfrr, allowing it to start daemons with -N and switching network namespaces respectively.

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
* *: un-split strings across lines | David Lamparter | 2020-07-14 | 1 | -26/+13
    Remove mid-string line breaks, cf. workflow doc:

      .. [#tool_style_conflicts] For example, lines over 80 characters are allowed for text strings to make it possible to search the code for them: please see `Linux kernel style (breaking long lines and strings) <https://www.kernel.org/doc/html/v4.10/process/coding-style.html#breaking-long-lines-and-strings>`_ and `Issue #1794 <https://github.com/FRRouting/frr/issues/1794>`_.

    Scripted commit, idempotent to running:
    ```
    python3 tools/stringmangle.py --unwrap `git ls-files | egrep '\.[ch]$'`
    ```

    Signed-off-by: David Lamparter <equinox@diac24.net>
* *: replace all random() calls | Rafael Zalamena | 2020-04-18 | 1 | -1/+2
    Replace all `random()` calls with a function called `frr_weak_random()` and make it clear that it is only supposed to be used for weak random applications.

    Use the annotation described by the Coverity Scan documentation to ignore `random()` call warnings.

    Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
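The idea in the commit above is to route every `random()` call through one clearly named wrapper. A small sketch of that idea follows; `demo_weak_random` and `demo_jitter` are hypothetical names, and the Coverity annotation mentioned in the commit is deliberately left out because its exact form is not given here:

```c
#include <stdlib.h>

/* Hypothetical wrapper in the spirit of frr_weak_random(): the name makes
 * it obvious at every call site that the value is not cryptographically
 * strong. */
long demo_weak_random(void)
{
    return random(); /* POSIX random(): fine for jitter, never for secrets */
}

/* Example use: spread a timer by 0 .. max_jitter-1 seconds. */
long demo_jitter(long max_jitter)
{
    if (max_jitter <= 0)
        return 0;
    return demo_weak_random() % max_jitter;
}
```

Keeping the single `random()` call in one place also means any analyzer suppression only has to be written once.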
* lib: rewrite zlog lock-free & TLS-buffered | David Lamparter | 2020-04-01 | 1 | -3/+3
    This is a full rewrite of the "back end" logging code. It now uses a lock-free list to iterate over logging targets, and the targets themselves are as lock-free as possible. (syslog() may have a hidden internal mutex in the C library; the file/fd targets use a single write() call which should ensure atomicity kernel-side.)

    Note that some functionality is lost in this patch:
    - Solaris printstack() backtraces are ditched (unlikely to come back)
    - the `log-filter` machinery is gone (re-added in followup commit)
    - `terminal monitor` is temporarily stubbed out. The old code had a race condition with VTYs going away. It'll likely come back rewritten and with vtysh support.
    - The `zebra_ext_log` hook is gone. Instead, it's now much easier to add a "proper" logging target.

    v2: TLS buffer to get some actual performance

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: change some messages from errors to info | Quentin Young | 2020-03-30 | 1 | -6/+5
    When watchfrr starts up, it first tries to connect to daemons. This is expected to fail if we are just starting up FRR, but we log it as an error, and it shows up red in journalctl. Similarly, when we fork background commands, that is also logged as an error. This is scaring users; let's change these to info.

    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* lib: rename memory_vty.c to lib_vty.c | David Lamparter | 2019-12-06 | 1 | -1/+0
    And memory_init() to lib_cmd_init().

    Signed-off-by: David Lamparter <equinox@diac24.net>
* *: generously apply const | David Lamparter | 2019-12-02 | 1 | -2/+2
    const const const your boat, merrily down the stream...

    Signed-off-by: David Lamparter <equinox@diac24.net>
* lib, watchfrr: Add some additional status messages to systemd | Donald Sharp | 2019-10-04 | 1 | -0/+12
    Allow systemd to be informed about operational state so operators can infer a bit about what is going on with FRR from the systemd status cli.

    sharpd@robot ~/frr4> systemctl status frr
    ● frr.service - FRRouting
       Loaded: loaded (/usr/lib/systemd/system/frr.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2019-10-03 21:09:04 EDT; 7s ago
         Docs: https://frrouting.readthedocs.io/en/latest/setup.html
      Process: 32455 ExecStart=/usr/lib/frr/frrinit.sh start (code=exited, status=0/SUCCESS)
       Status: "FRR Operational"
        Tasks: 12 (limit: 4915)
       Memory: 76.5M
       CGroup: /system.slice/frr.service
               ├─32468 /usr/lib/frr/watchfrr -d zebra bgpd staticd
               ├─32487 /usr/lib/frr/zebra -d -A 127.0.0.1 -s 90000000
               ├─32492 /usr/lib/frr/bgpd -d -A 127.0.0.1
               └─32500 /usr/lib/frr/staticd -d -A 127.0.0.1

    Please note the `Status: ...` line above.

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* watchfrr: Convert `wtf` to a more meaningful message | Donald Sharp | 2019-09-16 | 1 | -1/+3
    There is a fairly common state we are seeing where watchfrr has decided that something is not right and is printing out a `wtf` message. At this point I am not sure what is going on or how we are getting here, but let's add a bit more data dump to the message so that we can figure out what is going on.

    This is mainly being done because at this point in time I have no clue as to the what/how of how we got here and I cannot reproduce. Maybe by adding more useful information here I can figure out what is going on.

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* watchfrr: Allow end users to turn off watchfrr for a particular daemon | Donald Sharp | 2019-09-16 | 1 | -1/+32
    Allow an end user who is debugging behavior, with say gdb, to turn off watchfrr and its attempts to keep control of a daemon's up/responsiveness.

    With code change:

    donna.cumulusnetworks.com# show watchfrr
    watchfrr global phase: Idle
      zebra                Up
      bgpd                 Up/Ignoring Timeout
      staticd              Up

    Now grab bgpd with gdb:

    sharpd@donna ~/frr4> date ; sudo gdb -p 27893
    Mon 16 Sep 2019 01:44:57 PM EDT
    GNU gdb (GDB) Fedora 8.3-6.fc30
    Copyright (C) 2019 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
        <http://www.gnu.org/software/gdb/documentation/>.

    For help, type "help".
    Type "apropos word" to search for commands related to "word".
    Attaching to process 27893
    [New LWP 27894]
    [New LWP 27895]
    [New LWP 27896]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    0x00007f1787a3e5c7 in poll () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install glibc-2.29-15.fc30.x86_64 gperftools-libs-2.7-5.fc30.x86_64 json-c-0.13.1-4.fc30.x86_64 libcap-2.26-5.fc30.x86_64 libgcc-9.1.1-1.fc30.x86_64 libgcrypt-1.8.4-3.fc30.x86_64 libgpg-error-1.33-2.fc30.x86_64 libstdc++-9.1.1-1.fc30.x86_64 libxcrypt-4.4.6-2.fc30.x86_64 libyang-0.16.105-1.fc30.x86_64 lua-libs-5.3.5-5.fc30.x86_64 lz4-libs-1.8.3-2.fc30.x86_64 pcre-8.43-2.fc30.x86_64 xz-libs-5.2.4-5.fc30.x86_64
    (gdb)

    In another window we can see when watchfrr thinks it's not responding:

    donna.cumulusnetworks.com# show watchfrr
    watchfrr global phase: Idle
      zebra                Up
      bgpd                 Unresponsive/Ignoring Timeout
      staticd              Up

    Finally exit gdb and watchfrr now believes bgpd is good to go again:

    donna.cumulusnetworks.com# show watchfrr
    watchfrr global phase: Idle
      zebra                Up
      bgpd                 Up/Ignoring Timeout
      staticd              Up

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* *: fix some dumb printf format warnings | David Lamparter | 2019-06-11 | 1 | -4/+5
    Some types like `time_t` vary across platforms and always need to be cast when printed.

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
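The point made in the commit above can be shown in a few lines: since the width of `time_t` differs between platforms, cast it to a type with a known format specifier before printing. `demo_format_age` is a hypothetical helper written for this sketch:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: time_t may be 32 or 64 bits wide depending on the
 * platform, so cast it to long long and print with %lld instead of
 * guessing a specifier. Returns snprintf()'s result. */
int demo_format_age(char *buf, size_t len, time_t age)
{
    return snprintf(buf, len, "%lld seconds", (long long)age);
}
```

The same approach applies to other platform-dependent integer types such as `pid_t` or `off_t`.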
* *: Convert over to all -N namespace to change DAEMON_VTY_DIR | Donald Sharp | 2019-06-05 | 1 | -1/+1
    When the user specifies -N namespace, allow it to influence frr_vtydir (DAEMON_VTY_DIR) so that the namespace appears in its path, like so: $frrstate_dir/<namespace>

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
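The path layout named in the commit above, `$frrstate_dir/<namespace>`, amounts to a conditional path join. A minimal sketch follows; `demo_vtydir` and its parameters are hypothetical and do not reflect how libfrr actually builds `frr_vtydir`:

```c
#include <stdio.h>

/* Hypothetical sketch: build the VTY socket directory, appending the
 * namespace given with -N when there is one. Returns snprintf()'s result,
 * so a value >= len means the path was truncated. */
int demo_vtydir(char *buf, size_t len, const char *state_dir, const char *ns)
{
    if (ns == NULL || ns[0] == '\0')
        return snprintf(buf, len, "%s", state_dir);

    return snprintf(buf, len, "%s/%s", state_dir, ns);
}
```

Computing the directory in one place is what lets every daemon pick up the namespace automatically.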
* *: Convert to using frr_vtydir instead of DAEMON_VTY_DIR | Donald Sharp | 2019-06-04 | 1 | -1/+4
    In a variety of places we are using DAEMON_VTY_DIR; convert these to use frr_vtydir. This will allow us in a future commit to have the -N namespace option be automatically used.

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* watchfrr: build in defaults for -r/-s/-k | David Lamparter | 2019-02-19 | 1 | -1/+13
    There's no good reason to not have these options default to the installation path of tools/watchfrr.sh. Doing so allows us to ditch watchfrr_options from daemons/daemons.conf completely.

    Fixes: #3652
    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: don't wait forever at startup | David Lamparter | 2019-02-19 | 1 | -11/+38
    If we wait forever for all daemons to come up, we can hang the entire boot process, especially on init.d based systems.

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: add status command | David Lamparter | 2018-12-06 | 1 | -1/+26
    Just to see WTF is going on inside watchfrr...

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: immediately try connecting after start | David Lamparter | 2018-12-06 | 1 | -1/+24
    When we make a call to (re)start some daemon(s), we can immediately try connecting to its VTY socket after the script completes. If the daemon started correctly, this will always succeed since the start script only returns after daemon startup is complete.

    Among other things, this reduces the delay to "startup complete" notification at initial watchfrr start.

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: don't wait around pointlessly at startup | David Lamparter | 2018-12-06 | 1 | -4/+21
    We were waiting for timers to expire even when we already know the status of all daemons. This delays startup for no good reason.

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr, lib: cleanup & delay detaching | David Lamparter | 2018-10-02 | 1 | -100/+65
    This cleans up watchfrr to be more "normal" like the other daemons in terms of what it does in main(), i.e. using the full frr_*() call set.

    Also, this changes the startup behaviour on watchfrr to stay attached on the daemon's parent process until startup is really complete. This should allow removing the "watchfrr.started" hack at some point.

    Signed-off-by: David Lamparter <equinox@diac24.net>
* watchfrr: Modify some stderr messages to zlog_warn | Donald Sharp | 2018-09-25 | 1 | -3/+4
    The stderr output is not being displayed as part of watchfrr invocation in system startup. Specifically if the user has not properly sent 1 or more daemons to monitor. If the end-user is using tools/frr this stderr is dropped (and systemd appears to drop stderr too?).

    Modify the two stderr calls in this situation and use the zlog system.

    Now I can clearly see an error message that tells me what has gone wrong.

    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
    [DL: fixed typo]
* *: style for EC replacements | Quentin Young | 2018-09-13 | 1 | -12/+12
    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* *: LIB_[ERR|WARN] -> EC_LIB | Quentin Young | 2018-09-13 | 1 | -9/+9
    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* watchfrr: WATCHFRR_[ERR|WARN] -> EC_WATCHFRR | Quentin Young | 2018-09-13 | 1 | -5/+5
    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* watchfrr: fix global restart | Christian Franke | 2018-08-25 | 1 | -0/+3
    watchfrr needs to handle a SIGCHLD also when it calls a global restart command. Before this patch, it would lead to the following behavior:

    15:44:28: zebra state -> down : unexpected read error: Connection reset by peer
    15:44:33: Forked background command [pid 6392]: /usr/sbin/frr.init watchrestart all
    15:44:53: Warning: restart all child process 6392 still running after 20 seconds, sending signal 15
    15:44:53: waitpid returned status for an unknown child process 6392
    15:44:53: background (unknown) process 6392 terminated due to signal 15
    15:45:13: Warning: restart all child process 6392 still running after 40 seconds, sending signal 9
    15:45:33: Warning: restart all child process 6392 still running after 60 seconds, sending signal 9
    15:45:53: Warning: restart all child process 6392 still running after 80 seconds, sending signal 9
    15:46:13: Warning: restart all child process 6392 still running after 100 seconds, sending signal 9
    15:46:33: Warning: restart all child process 6392 still running after 120 seconds, sending signal 9
    15:46:53: Warning: restart all child process 6392 still running after 140 seconds, sending signal 9

    This is obviously incorrect and can be fixed by comparing the pid to the global restart object as well.

    Signed-off-by: Christian Franke <chris@opensourcerouting.org>
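The fix described in the commit above comes down to one extra comparison when mapping a reaped pid back to its owner. The sketch below uses hypothetical, simplified bookkeeping (`demo_restart`, `demo_find_child`) rather than watchfrr's real structures:

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified bookkeeping; not watchfrr's real structures. */
struct demo_restart {
    const char *name;
    pid_t pid; /* 0 when no background command is running */
};

/* Map a pid reported by waitpid() back to the restart record that forked
 * it. The global ("restart all") record is checked as well as each
 * per-daemon record; skipping it is what produced "unknown child". */
struct demo_restart *demo_find_child(struct demo_restart *global,
                                     struct demo_restart *daemons,
                                     size_t n_daemons, pid_t child)
{
    if (child <= 0)
        return NULL; /* not a real child pid */
    if (global->pid == child)
        return global;
    for (size_t i = 0; i < n_daemons; i++)
        if (daemons[i].pid == child)
            return &daemons[i];
    return NULL; /* genuinely unknown child */
}
```

Once the record is found, its pid can be cleared so the kill timer stops signalling a process that has already exited.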
* *: rename ferr_zlog -> flog_err_sys | Quentin Young | 2018-08-14 | 1 | -27/+28
    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* *: rename zlog_fer -> flog_err | Quentin Young | 2018-08-14 | 1 | -14/+14
    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* watchfrr: Add WATCHFRR_ERR_XXX for zlog_err to zlog_ferr | Donald Sharp | 2018-08-14 | 1 | -32/+46
    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* *: use C99 standard fixed-width integer types | Quentin Young | 2018-03-27 | 1 | -4/+4
    The following types are nonstandard:
    - u_char
    - u_short
    - u_int
    - u_long
    - u_int8_t
    - u_int16_t
    - u_int32_t

    Replace them with the C99 standard types:
    - uint8_t
    - unsigned short
    - unsigned int
    - unsigned long
    - uint8_t
    - uint16_t
    - uint32_t

    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
* watchfrr, vtysh: do not write config during crash | Quentin Young | 2018-03-21 | 1 | -0/+10
    If a daemon is restarting, crashed, or otherwise in the process of reconnecting to watchfrr and a user issues "write memory" or "write file" the resulting config will not include the configuration of that daemon. This is problematic because this output will overwrite the previous config, potentially causing unintentional loss of configuration stored only in the config file based upon timing.

    This patch remedies that by making watchfrr check that all daemons are up before attempting a configuration write, and updating vtysh so that its failsafe respects this condition as well.

    Note that this issue only manifests when using integrated config.

    Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
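The guard described in the commit above can be pictured as an all-daemons-up check that runs before any configuration write. The state names and `demo_can_write_config` below are hypothetical and only sketch the idea:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical daemon states for illustration; not watchfrr's real enum. */
enum demo_state { DEMO_DOWN, DEMO_CONNECTING, DEMO_UP, DEMO_UNRESPONSIVE };

/* A config write is allowed only when every monitored daemon is up;
 * otherwise the written file would be missing that daemon's section. */
bool demo_can_write_config(const enum demo_state *states, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (states[i] != DEMO_UP)
            return false;
    return true; /* note: also true when no daemons are monitored */
}
```

Refusing the write and reporting an error is the safer failure mode here, since the previous configuration file stays intact.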
* Merge pull request #1514 from donaldsharp/watchfrr | Martin Winter | 2017-12-12 | 1 | -1/+1
|\
    tools, watchfrr: Modify timeout to 90 seconds
| * tools, watchfrr: Modify timeout to 90 seconds | Brian Rak | 2017-12-04 | 1 | -1/+1
    The default timeout of 10 seconds is too quick of a timeout given some long running cli commands. Modify watchfrr to have a 90s timeout value instead.

    Signed-off-by: Brian Rak <brianrak@gameservers.com>
* | watchfrr: Fail gracefully if fopen fails | Donald Sharp | 2017-12-05 | 1 | -1/+2
|/
    Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
* Merge pull request #892 from opensourcerouting/watchfrr-simplify | Donald Sharp | 2017-08-09 | 1 | -247/+83
|\
    simplify watchfrr, add --terminal, improve startup logging
| * watchfrr: print specific error for removed options | David Lamparter | 2017-08-09 | 1 | -1/+11
    ... and document them in the man page.

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| * watchfrr: remove STATEDIR preprocessor define | David Lamparter | 2017-08-09 | 1 | -14/+8
    use frr_vtydir from libfrr instead.

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| * doc: update watchfrr manpage | David Lamparter | 2017-08-02 | 1 | -2/+1
    Remove -R, -a, -A, -e and -z options. Also remove blocker in the code that refuses to start if --dry is given together with -k / -s / -r.

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| * watchfrr: remove -z option | David Lamparter | 2017-08-02 | 1 | -16/+6
    Why would we not want to restart a daemon that's hanging?

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| * watchfrr: remove -e option | David Lamparter | 2017-08-02 | 1 | -12/+2
    Why would we not want to PING?

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
| * watchfrr: remove abundance of modes | David Lamparter | 2017-08-02 | 1 | -210/+63
    This leaves what were previously modes 0 (monitor-only) and 3 (restart daemons individually, but restart everything if zebra is restarted).

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>
* | watchfrr: hide systemd message if not systemd available | Jorge Boncompte | 2017-08-04 | 1 | -0/+2
|/
    Signed-off-by: Jorge Boncompte <jbonor@gmail.com>
* *: reindent (tag: reindent-master-after) | whitespace / reindent | 2017-07-17 | 1 | -244/+252
    indent.py `git ls-files | pcregrep '\.[ch]$' | pcregrep -v '^(ldpd|babeld|nhrpd)/'`

    Signed-off-by: David Lamparter <equinox@opensourcerouting.org>