path: root/msg.c
Commit message | Author | Age | Files | Lines
* Revert "mdadm: Fix socket connection failure when mdmon runs in foreground mode." | Mariusz Tkaczyk | 2024-06-21 | 1 | -19/+1
    This reverts commit 66a54b266f6c579e5f37b6253820903a55c3346c.
    connect_monitor() is called from ping_monitor() but this function is often used as advice, without verification that mdmon is really working. This produces hangs in many scenarios.
    Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: Fix socket connection failure when mdmon runs in foreground mode. | Shminderjit Singh | 2024-06-18 | 1 | -1/+19
    While creating an IMSM RAID, mdadm will wait for the mdmon main process to finish if mdmon runs in forking mode. This is because with "Type=forking" in the mdmon service unit file, "systemctl start service" will block until the main process of mdmon exits. At that moment, mdmon has already created the socket, so the subsequent socket connect from mdadm will succeed.
    However, when mdmon runs in foreground mode (without "Type=forking" in the service unit file), "systemctl start service" will return once the mdmon process starts. This causes mdadm and mdmon to run in parallel, which may lead to a socket connection failure since mdmon has not yet initialized the socket when mdadm tries to connect.
    If the next instruction/command is to access this device and try to write to it, a permission error will occur since mdmon has not yet set the array to RW mode.
    Signed-off-by: Shminderjit Singh <shminderjit.singh@oracle.com>
* Define sysfs max buffer size | Mateusz Kusiak | 2024-01-24 | 1 | -2/+2
    sysfs_get_str() usages have inconsistent buffer sizes. This results in wild buffer declarations and redundant memory usage.
    Define a maximum buffer size for sysfs strings. Replace the wild sysfs string buffer sizes with the globally defined value.
    Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
    Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* Monitor/msg: Don't print error message if mdmon doesn't run | Mariusz Tkaczyk | 2017-11-21 | 1 | -2/+0
    Commit 4515fb28a53a ("Add detail information when can not connect monitor") was added to warn about a failed connection to the monitor in the WaitClean function (see link below).
    Mdmon runs for IMSM containers only when they contain an array with redundancy, so if mdmon doesn't run, mdadm prints this error. This is misleading and unnecessary. Just print it in the WaitClean function.
    The sock in WaitClean is deprecated so it is removed.
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=1375002
    Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
    Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Add detail information when can not connect monitor | Xiao Ni | 2017-01-09 | 1 | -0/+2
    If mdadm can't connect to the monitor, the error message is currently just "Error waiting for xxx to be clean". Add a detailed error message in connect_monitor.
    Suggested-by: Oleg Samarin <osamarin68@gmail.com>
    Signed-off-by: Xiao Ni <xni@redhat.com>
    Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* Add casts for the addr arg of connect and bind | Khem Raj | 2016-01-14 | 1 | -1/+1
    glibc allows the addr arg to connect and bind to be any of a number of 'sockaddr_*' types, but musl requires 'const struct sockaddr *', which is in line with the Open Group specs. So add casts to allow compilation with musl.
    Signed-off-by: Khem Raj <raj.khem@gmail.com>
    Signed-off-by: NeilBrown <neilb@suse.com>
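    To illustrate the cast described above, here is a minimal sketch, not the actual msg.c code. The helper name connect_unix() and its error handling are assumptions made for this example; only the cast on the connect() call reflects what the commit describes.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper (not from msg.c): connect to the Unix-domain socket
 * at `path`. Returns the connected fd, or -1 on any failure. */
int connect_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	assert(path != NULL);
	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;	/* path would not fit in sun_path */

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	/* connect() is declared to take 'const struct sockaddr *', so the
	 * sockaddr_un pointer is cast explicitly; musl rejects it otherwise. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

    The same cast applies to bind(), which takes the address argument with the same type.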
* mdstat: discard 'dev' field, just use 'devnm' | NeilBrown | 2015-07-02 | 1 | -2/+2
    These both have the same value, and have done since the 'devnm' concept was introduced. So discard the pointless duplicate.
    Signed-off-by: NeilBrown <neilb@suse.de>
* Remove lots of unnecessary white space. | NeilBrown | 2013-06-19 | 1 | -2/+1
    Now that I am using white-space mode in Emacs I can see all of this, and I don't like it :-)
    Signed-off-by: NeilBrown <neilb@suse.de>
* Discard devnum in favour of devnm | NeilBrown | 2013-02-21 | 1 | -22/+5
    We widely use a "devnum" which is 0 or +ve for md%d devices and -ve for md_d%d devices. But I want to be able to use md_%s device names. So get rid of devnum (a number) and use devnm (a 32char string), e.g. md0, md_d2, md_home.
    Signed-off-by: NeilBrown <neilb@suse.de>
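    The difference between the two naming schemes can be sketched as follows. The helper name is hypothetical, and the exact mapping of negative numbers (-1 - num) is an assumption about the old convention rather than something stated in this log; the 32-character buffer follows the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: render an old-style devnum as a devnm string.
 * ASSUMPTION: negative devnum N names partitionable device md_d(-1-N). */
void legacy_devnum_to_devnm(int devnum, char *devnm, size_t len)
{
	assert(devnm != NULL && len > 0);
	if (devnum >= 0)
		snprintf(devnm, len, "md%d", devnum);
	else
		snprintf(devnm, len, "md_d%d", -1 - devnum);
}
```

    A string-valued devnm can also hold names such as md_home, which no integer devnum can express.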
* Remove scattered checks for malloc success. | NeilBrown | 2012-07-09 | 1 | -3/+1
    malloc should never fail, and if it does it is unlikely that anything else useful can be done. Best approach is to abort and let some super-daemon restart.
    So define xmalloc, xcalloc, xrealloc, xstrdup which don't fail but just print a message and exit. Then use those, removing all the tests for failure.
    Also replace all "malloc;memset" sequences with 'xcalloc'.
    Signed-off-by: NeilBrown <neilb@suse.de>
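    A minimal sketch of the "print a message and exit" wrappers described above. The message text and exit code are assumptions chosen for this example, not taken from the mdadm sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Report the failure and terminate; callers never see NULL.
 * The message and exit status here are illustrative assumptions. */
static void alloc_fail(void)
{
	fprintf(stderr, "memory allocation failure - aborting\n");
	exit(4);
}

void *xmalloc(size_t len)
{
	void *rv = malloc(len);

	if (!rv)
		alloc_fail();
	return rv;
}

/* Replaces the "malloc; memset" pattern: memory comes back zeroed. */
void *xcalloc(size_t num, size_t size)
{
	void *rv = calloc(num, size);

	if (!rv)
		alloc_fail();
	return rv;
}

char *xstrdup(const char *str)
{
	char *rv;

	assert(str != NULL);
	rv = xmalloc(strlen(str) + 1);
	strcpy(rv, str);
	return rv;
}
```

    With wrappers of this kind a caller can write `char *s = xstrdup(name);` and use `s` directly, without a NULL check after each allocation.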
* Introduce pr_err for printing error messages. | NeilBrown | 2012-07-09 | 1 | -13/+9
    'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
    cont_err() is also available.
    Signed-off-by: NeilBrown <neilb@suse.de>
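    One way such macros could be written is sketched below. The value of Name, the indent used by cont_err(), and the helper report_open_failure() are assumptions for illustration; `##__VA_ARGS__` is a GNU extension accepted by gcc and clang.

```c
#include <assert.h>
#include <stdio.h>

#define Name "mdadm"	/* assumed program-name prefix */

/* Prefix every error with the program name, as
 * 'fprintf(stderr, Name ": ' did before. */
#define pr_err(fmt, ...)   fprintf(stderr, Name ": " fmt, ##__VA_ARGS__)
/* Continuation line: indented instead of prefixed (indent width assumed). */
#define cont_err(fmt, ...) fprintf(stderr, "       " fmt, ##__VA_ARGS__)

/* Hypothetical caller; returns the number of characters written. */
int report_open_failure(const char *devnm)
{
	assert(devnm != NULL);
	return pr_err("cannot open %s\n", devnm);
}
```

    Because the macros expand to a plain fprintf() call, the format string must be a string literal so that it can be concatenated with the prefix.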
* Flush mdmon before next reshape step during container operation | Adam Kwolek | 2012-02-09 | 1 | -0/+10
    When using the takeover operation for grow purposes, mdadm has to be sure that mdmon has processed all updates, and if necessary mdmon will be closed at the takeover to raid0 operation.
    If mdmon is late, the next array in the container is processed and, due to a race condition, mdmon closes itself instead of monitoring the next reshape operation.
    Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* unblock_monitor(): Check sra is valid before dereferencing | Jes Sorensen | 2011-11-02 | 1 | -0/+2
    Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* ping_monitor(): check file descriptor is valid before using and closing it | Jes Sorensen | 2011-11-02 | 1 | -2/+7
    Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Move code to check_mdmon_version() function | Adam Kwolek | 2011-10-03 | 1 | -23/+36
    Move code to function for code reuse.
    Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* FIX: ping_monitor() usage causes memory leaks | Adam Kwolek | 2011-03-18 | 1 | -0/+14
    When devnum2devname() is used to build the input for ping_monitor(), the returned string pointer must be passed to free() to release the memory. This is not done in several places. Provide a function for this use case to avoid the memory leak.
    Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Add block_subarray() | Adam Kwolek | 2011-03-02 | 1 | -4/+14
    Put the code for blocking a subarray into a separate function. This function will be used for blocking arrays from mdmon monitoring during the assembly process. Arrays cannot wait for container assembly to finish, because meanwhile the monitor can enable arrays for writing.
    Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Remove stray 'free' in block_monitor. | NeilBrown | 2010-12-20 | 1 | -1/+0
    This value is passed in by caller so we should not be freeing it.
    Reported-by: "Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Grow: be extra careful about races when freezing an array | NeilBrown | 2010-12-15 | 1 | -1/+12
    If any subarray has any spare devices, then something raced, and we should abort the reshape.
    Signed-off-by: NeilBrown <neilb@suse.de>
* FIX: Cannot exit monitor after takeover | Adam Kwolek | 2010-12-03 | 1 | -2/+6
    When performing a backward takeover to raid0, the monitor cannot exit for a single raid0 array configuration. The monitor is locked by communication (ping_manager()) after unfreeze().
    Do not ping the manager for raid0 arrays, as they shouldn't be monitored.
    Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Improve comments for block_monitor. | NeilBrown | 2010-11-29 | 1 | -3/+12
    Also note that the leading '-' on the metadata names now simply means that mdmon must not reconfigure the array.
    Signed-off-by: NeilBrown <neilb@suse.de>
* block monitor: freeze spare assignment for external arrays | Dan Williams | 2010-11-23 | 1 | -2/+193
    In order to support reshape and atomic removal of spares from containers we need to prevent mdmon from activating spares. In the reshape case we additionally need to freeze sync_action while the reshape transaction is initiated with the kernel and recorded in the metadata.
    When reshaping a raid0 array we need to freeze the array *before* it is transitioned to a redundant raid level. Since sync_action does not exist at this point we extend the '-' prefix of a subarray string to flag mdmon not to activate spares.
    Mdadm needs to be reasonably certain that the version of mdmon in the system honors this 'freeze' indication. If mdmon is not already active then we assume the version that gets started is the same as the mdadm version. Otherwise, we check the version of mdmon as returned by the extended ping_monitor() operation. This is to catch cases where mdadm is upgraded in the filesystem, but mdmon started in the initramfs is from a previous release.
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
* Fix all the confusion over directories once and for all. | Doug Ledford | 2010-07-22 | 1 | -1/+1
    We now have 3 directory definitions: the mdmon directory for its pid and sock files (compile time define, not changeable at run time), the mdmonitor directory which is for the mdadm monitor mode pid file (can only be passed in via command line at the time mdadm is invoked in monitor mode), and the directory for the mdadm incremental assembly map file (compile time define, not changeable at run time). Only the mdadm map file still hunts multiple locations, and the number of locations has been reduced to /var/run and the compile time specified location.
    Re-use of similar sounding defines that actually didn't denote their actual usage at compile time made it more difficult for a person to know what effect changing the compile time defines would have on the resulting programs. This patch renames the various defines to clearly identify which item the define affects. It also reduces the number of various directories which will be searched for these files, as this has led to confusion in mdadm and mdmon in terms of which files should take precedence when files exist in multiple locations, etc.
    It's best if the person compiling the program intentionally and with planning selects the right directories to be used for the various purposes. Which directory is right depends on which items you are talking about, what boot loader your system uses, and what initramfs generation program your system uses. Because of the inter-dependency of all these items it would typically be up to the distribution that mdadm is being integrated into to select the correct values for these defines.
    Signed-off-by: Doug Ledford <dledford@redhat.com>
* fix mdmon takeover | Luca Berra | 2010-03-03 | 1 | -1/+1
    - when we waited for the old mdmon to exit, we didn't look for the socket in the right place
    - when we failed to find a pid file, we returned the wrong value (code expected <0, but got ==0).
    Signed-off-by: Luca Berra <bluca@comedia.it>
    Signed-off-by: NeilBrown <neilb@suse.de>
* mdmon: allow pid to be stored in different directory. | NeilBrown | 2010-02-04 | 1 | -1/+1
    /var/run probably doesn't persist from early boot. So if necessary, store it in /lib/init/rw or somewhere else that does persist.
    Signed-off-by: NeilBrown <neilb@suse.de>
* mdmon: preserve socket over chroot | Dan Williams | 2009-10-14 | 1 | -3/+11
    Connect to the monitor in the old namespace and use that connection for WaitClean requests when stopping the victim mdmon instance. This allows ping_monitor() to work post chroot().
    Cc: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* ping_manager() to prevent 'add' before 'remove' completes | Dan Williams | 2008-09-16 | 1 | -4/+29
    It is currently possible to remove a device and re-add it without the manager noticing, i.e. without detecting a mdstat->devcnt container->devcnt mismatch. Introduce ping_manager() to arrange for mdmon to run manage_container() prior to mdadm dropping the exclusive open() on the container.
    Despite these precautions sysfs_read() may still fail. If this happens invalidate container->devcnt to ensure manage_container() runs at the next event.
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* Add ping_monitor() to mdadm --wait | Dan Williams | 2008-09-16 | 1 | -1/+14
    The action we are waiting for may not be complete until the monitor has had a chance to take action on the result. The following script can now remove the device on the first attempt, versus a few attempts with the original Wait():

        #!/bin/bash
        #export MDADM_NO_MDMON=1
        export IMSM_DEVNAME_AS_SERIAL=1
        ./mdadm -Ss
        ./mdadm --zero-superblock /dev/loop[0-3]
        echo 2 > /proc/sys/dev/raid/speed_limit_max
        ./mdadm --create /dev/imsm /dev/loop[0-3] -n 4 -e imsm -a md
        ./mdadm --create /dev/md/r1 /dev/loop[0-3] -n 4 -l 5 --force -a mdp
        ./mdadm --fail /dev/md/r1 /dev/loop3
        ./mdadm --wait /dev/md/r1
        x=0
        while ! ./mdadm --remove /dev/imsm /dev/loop3 > /dev/null 2>&1
        do
            x=$((x+1))
        done
        echo "removed after $x attempts"
        ./mdadm --add /dev/imsm /dev/loop3

    Include 2 small cleanups:
    * remove the almost open coded fd2devnum() in Wait() by introducing a new utility routine stat2devnum()
    * teach connect_monitor() to parse the container device from a subarray string
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* msg: add a timeout to ping_monitor | NeilBrown | 2008-07-18 | 1 | -2/+2
    Though it should never be needed, having a timeout in ping_monitor is a sensible safeguard.
    Signed-off-by: Neil Brown <neilb@suse.de>
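    A bounded wait of this kind can be sketched with poll(). This is an illustration only: the helper name wait_reply(), the one-byte reply, and the millisecond timeout are assumptions, not the message format or timeout value used by msg.c.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait at most `tmo_ms` milliseconds for a one-byte
 * reply on `fd`. Returns 0 if a byte arrived, -1 on timeout or error. */
int wait_reply(int fd, int tmo_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char c;
	int n;

	assert(tmo_ms >= 0);
	n = poll(&pfd, 1, tmo_ms);
	if (n <= 0)
		return -1;	/* 0: timed out, <0: poll() failed */
	return read(fd, &c, 1) == 1 ? 0 : -1;
}
```

    Without the timeout, a peer that never answers would block the caller indefinitely; with it, the caller gets an error it can report.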
* Revise message passing code. | Neil Brown | 2008-07-12 | 1 | -141/+90
    More here
* Remove mgr_pipe for communicating from manage to monitor. | Neil Brown | 2008-07-12 | 1 | -1/+1
    Data is being passed in shared memory, so the pipe is only being used as a wakeup. This can more easily be done with a thread-signal.
* Handle device removal from container | Neil Brown | 2008-07-12 | 1 | -13/+0
    This really should be done in mdadm, not mdmon. We ensure the device won't be suddenly committed as a hot-spare using O_EXCL, then check the 'holders' sysfs directory to make sure it is only in use once.
* handle Manage_subdevs() for 'external' arrays | Dan Williams | 2008-05-15 | 1 | -0/+249
    From: Dan Williams <dan.j.williams@intel.com>
    1/ Block attempts to add/remove devices from container members
    2/ Forward add/remove requests to containers
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>