Commit message / Author / Age / Files / Lines
* Including missing xmalloc header in raid6check to fix FTBFS. (HEAD, main) Daniel Baumann42 hours1-0/+2
| | | | Signed-off-by: Daniel Baumann <daniel@debian.org>
* test: return fail if any failedMariusz Tkaczyk44 hours1-8/+12
| | | | | | GH action status should be failed if any test failed. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
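
The behaviour described in this commit can be sketched in shell as follows. This is a minimal illustration, not the code from the mdadm test script: `run_all` is a hypothetical name, and each argument is assumed to be a command string that returns non-zero on failure.

```shell
# Hypothetical sketch: run every test, remember whether any of them failed,
# and report that through the return status (which a CI job can act on).
run_all() {
    ret=0
    for t in "$@"; do
        # Keep running the remaining tests after a failure.
        if ! sh -c "$t"; then
            echo "FAILED: $t"
            ret=1
        fi
    done
    return "$ret"
}
```

For example, `run_all true false true` runs all three commands and still returns a non-zero status, because one of them failed.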
* test: Log execution timeMariusz Tkaczyk44 hours1-0/+13
| | | | | | | To start optimizing the test suite, we need to know which tests are the most time-consuming. Log the execution time after every test. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
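
One way to log per-test execution time is sketched below. It is an assumption-laden illustration rather than the change made in the mdadm test script: `timed_run` is a hypothetical helper, and it measures whole seconds with `date +%s`.

```shell
# Hypothetical sketch: run one test command and log how long it took.
timed_run() {
    start=$(date +%s)
    status=0
    # Capture the status without aborting under "set -e".
    sh -c "$1" || status=$?
    end=$(date +%s)
    echo "Execution time (seconds): $((end - start))"
    return "$status"
}
```

The helper preserves the exit status of the test, so timing can be added without changing pass/fail reporting.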
* imsm: fix tpv drvies check in add_to_superBlazej Kucman2 days1-1/+2
| | | | | | | | | | | | | | | | Before the mentioned patch, the check that verifies whether IMSM on the current platform supports TPV (non-Intel) disks was performed only for non-Intel disks; after it, the check is performed for all disks. As a result, no disk can be used when the platform does not support TPV drives, and the attempt fails with the following error: mdadm: Platform configuration does not support non-Intel NVMe drives. Please refer to Intel(R) RSTe/VROC user guide. This change restores the check of whether the disk is non-Intel. Fixes: 734e7db4dfc5 ("imsm: Remove warning and refactor add_to_super_imsm code") Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
* tests: fix "foreign" verification for nameing tests.Mariusz Tkaczyk4 days2-3/+5
| | | | | | | | | | Mdadm supports DEVNODE in multiple forms; we cannot trust it because it does not always reflect the name in metadata. The tests define clear expectations, so we must use them. Do the foreign verification against WANTED_NAME instead of the passed DEVNODE. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* platform-intel: fix buffer overflowBlazej Kucman4 days1-2/+2
| mdadm -C /dev/md/imsm0 -e imsm -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
| mdadm -C /dev/md/r0d2 -l 0 -n 2 /dev/nvme5n1 /dev/nvme4n1 -R
| *** buffer overflow detected ***: terminated
| Aborted (core dumped)
|
| The issue is related to the D_FORTIFY_SOURCE=3 flag and depends on the environment, especially the compiler version. In the function active_arrays_by_format, the length of the path buffer is calculated dynamically based on parameters, while PATH_MAX is used in snprintf; this may lead to a buffer overflow.
|
| It is fixed by replacing the dynamic length calculation with the PATH_MAX define for the path length.
|
| Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
* CI: run mdadm tests on test scripts changeKinga Stefaniuk7 days1-0/+2
| | | | | | Run mdadm tests scope on every change related to test files. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* debug: add timestamps for debug messagesMateusz Kusiak7 days3-9/+13
| | | | | | | | | | Timestamps on debug messages help establish what takes long to process. Debug messages are printed only if the DDEBUG flag is passed. Add timestamps for debug messages. Remove dead code from dprintf dummies for non-debug builds. Remove timestamps from current debug messages. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
* CI: assign ret to numeric valueKinga Stefaniuk11 days1-1/+2
| | | | | | | Use variable to store tests exit status. Return its value when test script finished. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* README: Rephrase mailing list chapterMariusz Tkaczyk2024-11-141-4/+4
| | | | | | | As suggested by Dan, make it sound more welcoming. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* CI: use self-hosted runner to run testsKinga Stefaniuk2024-11-142-0/+68
| | | | | | Use a prepared VM in GitHub Actions to run mdadm tests on it. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* func.sh: do not hang when grow-continue can't finishKinga Stefaniuk2024-11-132-9/+38
| | | | | | | | When grow-continue process is ongoing, sync_action indicates that recovery is in progress. If grow-continue does not finish, even if sync_action is not "reshape" anymore, the test should fail. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* Fix 07reshape5initr testKinga Stefaniuk2024-11-131-1/+15
| | | | | | | | This test could hang if the "check" action is not written to sync_action. If this value did not appear, the test hung in an infinite while loop. Add a 5-second timeout to the loop. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
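
A bounded wait of the kind described here can be sketched as below. This is not the code from the 07reshape5intr test: `wait_for_value` is a hypothetical helper that polls a file once per second and gives up after a caller-supplied number of seconds.

```shell
# Hypothetical sketch: wait until FILE contains WANTED, but give up after
# TIMEOUT seconds instead of looping forever.
wait_for_value() {
    file=$1
    wanted=$2
    timeout=$3
    waited=0
    while [ "$(cat "$file" 2>/dev/null)" != "$wanted" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timeout waiting for '$wanted' in $file" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `wait_for_value /sys/block/md0/md/sync_action check 5` would match the 5-second limit mentioned in the commit message; the array name `md0` is only an example.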
* imsm: add print license for VMDBlazej Kucman2024-11-082-0/+52
| | | | | | | | Print the IMSM license for VMD controllers in --detail-platform. The license specifies the scope of RAID support in the platform for the VMD controller. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
* tests: remove --autoMariusz Tkaczyk2024-11-073-8/+8
| | | | | | It is deprecated and it is not tested now. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdopen: remove wrong conditionMariusz Tkaczyk2024-11-061-5/+0
| | | | | | | | | | After the mentioned patch, this condition got the opposite meaning and it blocks creation in cases where it was supported. Remove it now. Fixes: 119cdcad049e ("mdadm: drop auto= support") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm.conf: remove refferences to old kernels.Mariusz Tkaczyk2024-11-051-1/+1
| | | | | | Remove them. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* md.man: Remove refferences to not supported kernelMariusz Tkaczyk2024-11-051-43/+12
| | | | | | Reader doesn't need it. Remove it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm.man: Remove refferences to legacy kernelsMariusz Tkaczyk2024-11-051-85/+18
| | | | | | | We are not supporting kernels older than 3.10. Update mdadm man. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: drop auto= supportMariusz Tkaczyk2024-11-0512-385/+60
| | | | | | | | | | | | According to the author (and what was described in man): "With mdadm 3.0, device creation is normally left up to udev so this option is unlikely to be needed". This was a workaround for kernel 2.6 family issues (partitionable and non-partitionable arrays hell) and I believe we are far away from it now. I'm not aware of any usage of it, hence it is removed. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* ReadMe: Fix stylistic issuesMariusz Tkaczyk2024-11-053-211/+155
| | | | | | No functional changes, just adapt the style to allow checkpatch to pass. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdmon: delegate removal to managemonMariusz Tkaczyk2024-11-044-41/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting from [1], the kernel requires a suspend lock on the member drive remove path. It causes a deadlock with external management because the monitor thread may be blocked on suspend and unable to switch the array to active, for example if a badblock is reported at this time. Removal is a blocking action now, so it must be delegated to the managemon thread, but we must ensure that the monitor does the metadata update first, just after detecting faulty. This patch adds appropriate support. The monitor thread detects "faulty" and updates the metadata. After that, it asks the manager thread to remove the device. The manager must be careful because closing descriptors used by select() may lead to an abort with D_FORTIFY_SOURCE=2. First, it must ensure that device descriptors are not used by the monitor. There is an unlimited number of remove retries, and recovery is blocked until all failed drives are removed. It is safe because a "faulty" device is no longer used by MD. The issue will also be mitigated by an optimization on the badblock recording path in the kernel. It will check that the device is not failed before a badblock is recorded, but relying on this is not ideologically correct. Userspace must keep compatibility with the kernel, and since it is a blocking action, we must treat it as a blocking action. [1] kernel commit cfa078c8b80d ("md: use new apis to suspend array for adding/removing rdev from state_store()") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* monitor: Add DS_EXTERNAL_BB flagMariusz Tkaczyk2024-11-042-21/+31
| | | | | | | | | | If this is set, then metadata handler must support external badblocks. Remove checks for superswitch functions. If mdi->state_fd is not set then we should not try to record badblock, we cannot trust this device. No functional changes. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* sysfs: add sysfs_open_memb_attr()Mariusz Tkaczyk2024-11-043-83/+67
| | | | | | | | | | Function is added to not repeat defining "dev-%s", disk_name. Related code branches are updated. Ioctl way for setting disk faulty/remove is removed, sysfs is always used now. Some non functional style issues are fixed in Manage_subdevs(). Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* [PATCH] mdadm: Grow.c distinguish takeover vs reshape on grow operationNigel Croxon2024-10-281-1/+2
| | | | | | | | Correcting the terminology on the output when doing a takeover vs a reshape. Signed-off-by: Nigel Croxon <ncroxon@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
* mdadm/Grow: Check new_level interface rather than kernel versionXiao Ni2024-10-181-1/+1
| | | | | | | | Different OS distributions ship different kernel versions. Check the new_level sysfs interface rather than the kernel version. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
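
The idea of probing for an interface instead of comparing version numbers can be illustrated in shell. The real change is in Grow.c; the sketch below only shows the principle. `has_md_attr` is a hypothetical helper, and the location of `new_level` under the array's `md` sysfs directory is an assumption.

```shell
# Hypothetical sketch: detect a feature by checking that its sysfs attribute
# exists, rather than by parsing the kernel version.
has_md_attr() {
    md_dir=$1   # e.g. /sys/block/md0/md (array name is an example)
    attr=$2     # e.g. new_level
    [ -e "$md_dir/$attr" ]
}
```

This keeps working on distribution kernels that backport the interface without changing the reported version, which is the situation the commit message describes.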
* mdadm/Manage: Clear superblock if adding new device failsXiao Ni2024-10-181-0/+4
| | | | | | | | The superblock is kept if adding new device fails. It should clear the superblock if it fails to add a new disk. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* util: use only /dev directory in open_dev()Kinga Stefaniuk2024-10-161-11/+0
| | | | | | | | | Previously, open_dev() tried to open the device in two ways: using the /dev directory and the /tmp directory. This function can be called by users that have no access to the /tmp directory (e.g. udev); dev_open() then fails, which may affect many processes. Remove the attempt to open in the /tmp directory. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* mdadm.man: Add udev-rules flagAndre Paiusco2024-10-161-0/+10
| | | | | | | The --udev-rules flag is added and points to the mdadm.conf man page for further explanation of POLICY. Signed-off-by: Andre Paiusco <github@paiusco.org>
* mdadm.conf.man: Explain udev ruleAndre Paiusco2024-10-161-10/+14
| | | | | | | | | Clarify that a filename is accepted and that the udev rules need to be reloaded. Small correction to the example order. Signed-off-by: Andre Paiusco <github@paiusco.org>
* mdadm: Add mdadm_status.hAnna Sztukowska2024-10-103-8/+16
| | | | | | | Move mdadm_status_t to mdadm_status.h file. Add status for memory allocation failure. Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* mdadm.man: elaborate more about mdmonitor.serviceMariusz Tkaczyk2024-10-102-29/+34
| | | | | | Describe how it behaves and how it can be configured to work. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdmonitor: Abandon custom configuration filesMariusz Tkaczyk2024-10-103-53/+15
| | | | | | | | | | | | | | | Operating system vendors are customizing the mdmonitor service because the default form is not satisfying for them (except SUSE). As a result, support is complicated (maintainers have to check the system) and the man page is not detailed. I propose to abandon custom configuration files via sysconfig and keep the configuration inside mdadm.conf only. A detailed comment in the service for OSV maintainers is added to help with the transition. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* super-intel: move scsi_get_serial from sg_ioKinga Stefaniuk2024-10-083-66/+45
| | | | | | | scsi_get_serial() function is used only by super-intel.c. Move function to this file and remove sg_io.c file. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* Rename Monitor.c to mdmonitor.cKinga Stefaniuk2024-10-072-1/+1
| | | | | | | Rename Monitor.c to mdmonitor.c to avoid errors during compilation on case-insensitive filesystems. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* util: fix sys_hot_remove_disk()Mariusz Tkaczyk2024-10-041-1/+1
| | | | | | | Instead of "remove", "faulty" was called. Fixes: d95edceb362a ("sysfs: add function for writing to sysfs fd") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* md.man: update refference to raid5-ppl.rstMariusz Tkaczyk2024-10-041-8/+2
| | | | | | | | | Documentation/md has moved to Documentation/driver-api/md. Update and rework the sentence. Remove the reference to an unsupported kernel close to the updated text. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: add xmalloc.hMariusz Tkaczyk2024-09-2733-44/+100
| | | | | | | | | | Move memory declaration helpers outside mdadm.h. They seem to be useful, so keep them but include them separately. Rework them to not refer to Name[] declared internally in mdadm/mdmon. This is the first step to start simplifying mdadm.h. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* Mdmonitor: Fix startup with missing directoryAnna Sztukowska2024-09-271-7/+7
| | | | | | | | | Commit 0a07dea8d3b78 ("Mdmonitor: Refactor check_one_sharer() for better error handling") introduced an issue: if the directory /run/mdadm is missing, the monitor fails to start. Move the directory creation earlier to ensure it is always created. Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* sysfs: add function for writing to sysfs fdMariusz Tkaczyk2024-09-276-56/+101
| | | | | | | | | | The proposed function sysfs_write_descriptor() unifies error handling for write() done to sysfs files. Its main purpose is use with MD sysfs files, but it can be used elsewhere. No functional changes. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* Incremental: Rename IncrementalRemoveMariusz Tkaczyk2024-09-273-5/+4
| | | | | | | Rename it to Incremental_remove for better readability. No functional changes. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* CI: do not install unnecessary packagesKinga Stefaniuk2024-09-261-2/+2
| | | | | | | Updating all of the packages every time is not needed and costs a lot of resources. Install only necessary packages and their dependencies. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* Remove INSTALL and dev/nullMariusz Tkaczyk2024-09-232-13/+0
| | | | | | | | | INSTALL is not needed because it was added to README.md. dev/null was created accidentally. Remove them. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Manage: record errnoXiao Ni2024-09-231-3/+5
| | | | | | | | | Sometimes it reports: mdadm: failed to stop array /dev/md0: Success The reason is that errno is reset. So record errno during the loop. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: remove 09imsm-assemble.brokenXiao Ni2024-09-231-6/+0
| | | | | | | 09imsm-assemble can run successfully. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: 07testreshape5 fixXiao Ni2024-09-232-12/+1
| | | | | | | Init dir to avoid test failure. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: Remove 07reshape5intr.brokenXiao Ni2024-09-231-45/+0
| | | | | | | 07reshape5intr can run successfully now. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: 07changelevels fixXiao Ni2024-09-234-24/+16
| There are five changes to this case.
|
| 1. Remove the testdev check. It can't work anymore; check whether it's a block device directly.
|
| 2. It can't change level and chunk size at the same time.
|
| 3. Sleep more than 10s before check wait. The test devices are small. Sometimes the reshape can finish very quickly, right after it starts, and mdadm will be stuck before it waits for the reshape to start. So the sync speed is limited, and it is restored when waiting for the reshape to finish. This is good for the case without a backup file.
|
|    When a backup file is specified, the systemd service mdadm-grow-continue is used to monitor reshape progress. If the reshape finishes before the service starts monitoring it, the daemon will be stuck too, because reshape_progress is 0, which means the reshape hasn't been started. So give the service more time to get the right information from kernel space. But before getting this information, it needs to suspend the array, while the reshape is running at the same time. The kernel reshape daemon will update metadata 10s. So the sync speed needs to be limited for more than 10s before it is restored. Then the systemd service can suspend the array and start monitoring reshape progress.
|
| 4. Wait until the mdadm-grow-continue service exits. mdadm --wait doesn't wait for the systemd service. For the case that needs a backup file, the systemd service deletes the backup file after the reshape finishes. In this test case, the next case runs when the reshape finishes, and it fails because it can't create the backup file while the old backup file still exists.
|
| 5. Don't reshape from raid5 to raid1. It can't work now.
|
| Signed-off-by: Xiao Ni <xni@redhat.com>
| Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: wait until level changesXiao Ni2024-09-231-0/+4
| | | | | | | | | check wait waits for the reshape to finish, but it doesn't wait for the level change. The level change happens in a forked child process. So we need to find the child process and monitor it. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
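
Monitoring a process until it exits can be sketched as follows. This is not the code added to the mdadm tests: `wait_pid_gone` is a hypothetical helper, it assumes the PID has already been found by some other means, and it assumes the process is not a child of the calling shell (an unreaped child would still answer `kill -0`).

```shell
# Hypothetical sketch: poll until the process with the given PID is gone,
# with an upper bound so a stuck process cannot hang the test.
wait_pid_gone() {
    pid=$1
    timeout=$2
    waited=0
    # "kill -0" sends no signal; it only checks that the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

The timeout is a design choice for the sketch; the commit message itself only says that the child process must be found and monitored.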
* mdadm/Grow: sleep a while after removing disk in impose_levelXiao Ni2024-09-231-0/+7
| | | | | | | | | | | It needs to remove disks when reshaping from raid456 to raid0. In kernel space this sets MD_RECOVERY_RUNNING, and changing the level will fail. So wait some time to let the md thread clear this flag. This was found by test case 05r6tor0. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>