-
-
Notifications
You must be signed in to change notification settings - Fork 268
Fix backup removal in exit task and cleanup handling of URL mountpoints #2625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix backup removal in exit task and cleanup handling of URL mountpoints #2625
Conversation
The "fix" actually has not fixed anything (rear#781 was a different issue) and reintroduces bug rear#465 (re-reported as rear#2611). The issue is that if the directories are not empty, it is because they contain potentially valuable data (old backups and backups from other machines), so removing them in order to silence some error is the worst thing to do. This reverts commit 4e9c2a1.
…-YUM files to copies of those files" Files that did not differ from their originals and have not gained significant updates afterwards are better kept as symlinks to reduce duplication and maintenance effort. This reverts commit 3d9cc24 except for usr/share/rear/restore/YUM/default/970_set_root_password.sh
No functional change.
Introduced as a copy in e68326f. No functional change.
Replace identical script copies by symlinks.
$BUILD_DIR/outputfs/$RSYNC_PREFIX has never been used by rsync. Removing it is even dangerous, because usually $RSYNC_PREFIX == $NETFS_PREFIX == $OUTPUT_PREFIX, so this would lead to a loss of data if the unmount fails. It could even remove all the backups if $RSYNC_PREFIX is empty.
NETFS_PREFIX has not been used in the output stage since a8fdc44. This was the last instance.
Use the function keyword to declare functions. https://github.com/rear/rear/wiki/Coding-Style#functions
Checks whether a given URL scheme supports direct filesystem access when mounted.
Space-separate commands from the enclosing parens.
Use Error instead of StopIfError which is problematic for several reasons (see the comments near its definition). Error also shows last error messages from the failed commands. Analogous to changes in PR rear#1877.
Since a8fdc44 the handling of BACKUP_URL and OUTPUT_URL is separate and the code is not shared between backup and output stages. Also, the code that handles the lockfile is different anyway. This allows some simplification by removing the indirection based on the current stage that originally allowed sharing the code. Making this code generic and shared between the stages might be a good idea, but would have to be done a bit differently.
The PXE output (except for the legacy support) needs direct filesystem access to OUTPUT_URL. Therefore complain if OUTPUT_URL can not be mounted.
Instead of testing whether the path returned by output_path() is empty, use the new dedicated scheme_supports_filesystem() function for that. This is less error prone, because it is easy to ignore the empty string returned from output_path() and accidentally create files in the root directory. Functional change: output schemes that do not support mounting (anything "ftp-like" or "tape-like") will stop creating an empty directory $BUILD_DIR/outputfs.
Determines whether one can upload files to a URL. True for most schemes except for null and tape-based schemes (tape and obdr). Does not tell how exactly one is supposed to upload files to the URL: it could be via a local mountpoint (mounted by mount_url) or by using a file transfer utility for URLs that can not be mounted (usually lftp or rsync). Use scheme_supports_filesystem() to distinguish between the two cases.
Replaces a hardcoded list of schemes that do not support uploading files and can be skipped. Also stop relying on output_path() returning an empty string for paths that do not accept files and move the output_path() call down to the place where we actually know that we will use the result. Functional change: if the OUTPUT method has produced files but we don't know how to upload them, abort. This prevents the output from being silently lost in case the selected scheme does not accept them. This will affect the tape scheme. The only exception is null, because it indicates that the user does not expect the files to be uploaded anywhere.
${array[*]} instead of "${array[@]}" was indeed intended. Those uses are
inside bigger quoted strings and ${array[@]} would split them into
multiple strings.
Use ${array[*]} in other places where it was not originally intended,
but should have been.
"Wrong" here means schemes that do not support filesystem access. Since
this function returns a filesystem path that can be used to access the
URL destination directly, it is crucial that this is actually supported.
Returning an empty string for them is not satisfactory: it could lead to
caller putting its files under / instead of the intended location if the
result is not checked for emptiness.
Returning ${BUILD_DIR}/outputfs/${OUTPUT_PREFIX} even for unmountable
URLs (as done before) is also not satisfactory: caller could put its
files there expecting them to be safely at their destination, but if the
directory is not a mountpoint, they would get silently lost.
All callers have been modified to avoid calling this function if it
would abort.
After umount_url, this directory should not be there. Removing it is even dangerous, because it could lead to a loss of data if the unmount fails. It could even remove all the outputs if $OUTPUT_PREFIX happens to be empty.
This function has multiple problems, see comments in code - better to use explicit tests and Error.
There are many places where a temporary mount point is being removed, so introduce a shared function for it, even if it is a tiny one-liner.
All callers of mount_url() create a temporary mount point for it, so we can move the mount point creation there, together with all the exit task logic, and the corresponding removal into umount_url(). Functional change: mount point does not get created and removed for URLs that do not need to be mounted. This mount point would be unused and if not, it would be a bug (nobody would save files that appear there, so they would get lost). Bug fix: call the filesystem-specific umount handling for davfs and sshfs (it used to be ignored).
If scheme_supports_filesystem returns false, the URL is definitely not mountable. Use this instead of keepeing a long list of unmountable URL schemes in mount_url() and umount_url(). There are some URLs that allow filesystem accces even if we don't mount and umount them (file: and sometimes iso: ). They need to be handled as special case. Still, this reduces the amount of hardcoded schemes quite a lot.
This ensures that a filesystem-specific unmount command is invoked instead of the generic `umount`, if necessary. It also makes improvements like lazy unmount easy to consistently use in all places that benefit from them. Does not work for the var:// scheme, but this has been broken anyway (by the "fix" for rear#170, see the discussion for commit 20359a9, and custom umount command has never been used in exit tasks.)
Does it need fusermount -u for umount? Seems similar to the sshfs case, but does not get the same special handling.
Temporary mountpoint for output is now handled internally in mount_url()/umount_url(). Fixes a possible data loss: the exit task was using rm -Rf, which would remove the output files for other machines, if the same tftpboot directory is shared among multiple machines (and ReaR encourages this, since it uses a unique prefix to identify backups from different machines) and unmounting it fails. The shared implementation of exit tasks now uses rmdir instead of rm -r.
Instead of special code in cleanup_build_area_and_end_program(), use the shared helper to remove $BUILD_DIR/outputfs. This translates to using rmdir always instead of using rm -Rf in case the filesystem is not mounted. This is simpler and safer (disaster could arise if the mountpoint command fails for any unexpected reason, for example). It intentionally reverts PR rear#578 (commit 606d8ce). It is not clear why was .lockfile present in the unmounted outputfs (this was the reason for the change). According to my understanding of the code, this should not happen. Even if this happens, it means that some part of the code is using outputfs incorrectly (it should be used only if mounted, otherwise data put there are lost) and it is better to be warned about the problem rather than ignoring it by blindly removing the whole directory.
7c66c68 to
0515eaf
Compare
|
rebased again to fix a typo in commit message |
|
I thought that OBDR was HP-specific, but maybe it got implemented by other vendors as well. |
|
If there are no objections right now |
|
When OBDR is/was HP-specific it explains why IBM documents don't mention it. |
When the build area is left for debugging purposes, we tell the user that they should remove it later. Since the output filesystem with valuable old backups or backups from other machines may be still mounted at $BUILD_DIR/outputfs, tell the user to use rm -Rf --one-file-system instead of just rm -Rf. This prevents data loss if the user follows the intruction literally. All supported distributions now support the --one-file-system option, so there should be no problem with this.
umount_mountpoint_lazy() was introduced for this purpose earlier, so use it also in cleanup_build_area_and_end_program(). Note: the cleanup in cleanup_build_area_and_end_program() duplicates the exit tasks used by mount_url()/umount_url(). This has been the case for a long time and should not be a real problem, althought it would be probably safe to remove this cleanup from cleanup_build_area_and_end_program() and leave only those in mount_url()/umount_url().
Instead of hardcoding a list of output URL schemes that can be accessed as a filesystem when copying result files to their destination, use the scheme_supports_filesystem() function that tells us exactly this information. This extends supported schemes to anything that can be mounted by mount_url().
If the temporary mountpoint does not exist, we should be happy, since there is nothing to do. Unfortunately rmdir (unlike rm -Rf) reports an error in this case, so test for the directory existence before.
This used to be the case before. This case is encountered without any local configuration, because by default BACKUP_URL is empty and OUTPUT_URL is equal to BACKUP_URL. It might be better to use OUTPUT_URL=null as the default value instead of empty, but for now preserve the previous behavior.
Apparently it is not needed since rear#2416.
Functions that optionally perform lazy umount now require that the optional argument is either empty or equal to "lazy". An arbitrary nonempty string is not enough and is now considered a bug. This improves readability. Suggestion by @jsmeix.
0515eaf to
2b56f38
Compare
|
I pushed again to fix another tiny commit message mistake, no functional difference. |
|
@pcahyna Yes, OBDR is HPE specific - see the included PDF file.
That being said we never got a complaint it wasn't working (and I'm pretty sure it is broken in the meantime) - IMHO I'm in favor of removing it altogether. @dagwieers Do you mind if it was gone as you were the only one ever used by my knowledge? |
|
@pcahyna I wish you and all ReaR contributors and users |
Pull Request Details:
Type: Bug Fix / Cleanup
Impact: Critical
Reference to related issue (URL): ReaR configured with netfs backup can delete other/old folders on NFS at failure #2611 Fixes #781 Fix for removing directory in case that they are not empty #782 rear configured with netfs backup deletes other folders on NFS at failure #465
How was this pull request tested?
On CentOS 7 or CentOS 8:
OUTPUT_URL=sftp://umountby a function that does nothing and inserting a script that kills ReaR before unmounting URLs, thus triggering the exit tasks$BUILD_DIR/outputfsumountby a function that does nothing and returns an errorBACKUP=RSYNC)BACKUP=NETFSBACKUP_URL=sshfs://...)OUTPUT=PXEand several reproducersPXE_TFTP_URL: nfs://...,PXE_CONFIG_URL: nfs://. Reproducers consist of overridingumountby a function that does nothing and returns an error800_copy_to_tftp.sh810_create_pxelinux_cfg.shOUTPUT=PXEand no bug reproducerPXE_TFTP_URL: nfs://...,PXE_CONFIG_URL: nfs://In the case of reproducers I chceked that other directories under the mountpoint do not get removed. In the case of no reproducers I checked that the proces completes successfully, and also that the build dir is correctly removed and that ReaR does not create and remove useless
$BUILD_DIR/outputfswith URL schemes that do not need it (rsync://,sftp://).Cleanup of temporary mount point handling, particularly for output. Unification of mount point umount and cleanup - move to the
mount_url()andumount_url()functions, callers do not need to worry about it, resulting in huge deduplication (demultiplication), this concerns both the normal path and the error path (exit tasks). Replaced the variousrm -rfof the mountpoint byrmdir(this fixes #2611) and added lazy umount in case normal umount does not succeed. If build dir is kept, propose a safe way to remove it to the user (rm -Rf --one-file-systeminstead of justrm -Rfwhere the user would risk to remove everything in their mounted filesystem if still mounted.)Stop creating the empty mountpoint (
$BUILD_DIR/outputfs) for schemes that do not need to mount anything (rsync://or anything "ftp-like".)Fixes also some other bugs noted in the process: filesystem-specific umount command not called (20359a9#r51319634) , unknown schemes considered invalid (noted in discussion under #932).
Identical scripts under
DUPLICITYandYUMwere replaced by symlinks to increase code sharing.Reverts #782 (the change that introduced bug #2611) and #578 - it is not clear how .lockfile can exist in the unmounted filesystem, and if it does, it is a bug.