Skip to content

Conversation

@pcahyna
Copy link
Member

@pcahyna pcahyna commented Jun 4, 2021

Pull Request Details:

In the case of reproducers I chceked that other directories under the mountpoint do not get removed. In the case of no reproducers I checked that the proces completes successfully, and also that the build dir is correctly removed and that ReaR does not create and remove useless $BUILD_DIR/outputfs with URL schemes that do not need it (rsync://, sftp://).

  • Brief description of the changes in this pull request:

Cleanup of temporary mount point handling, particularly for output. Unification of mount point umount and cleanup - move to the mount_url() and umount_url() functions, callers do not need to worry about it, resulting in huge deduplication (demultiplication), this concerns both the normal path and the error path (exit tasks). Replaced the various rm -rf of the mountpoint by rmdir (this fixes #2611) and added lazy umount in case normal umount does not succeed. If build dir is kept, propose a safe way to remove it to the user (rm -Rf --one-file-system instead of just rm -Rf where the user would risk to remove everything in their mounted filesystem if still mounted.)

Stop creating the empty mountpoint ($BUILD_DIR/outputfs) for schemes that do not need to mount anything (rsync:// or anything "ftp-like".)

Fixes also some other bugs noted in the process: filesystem-specific umount command not called (20359a9#r51319634) , unknown schemes considered invalid (noted in discussion under #932).

Identical scripts under DUPLICITY and YUM were replaced by symlinks to increase code sharing.

Reverts #782 (the change that introduced bug #2611) and #578 - it is not clear how .lockfile can exist in the unmounted filesystem, and if it does, it is a bug.

pcahyna added 29 commits June 4, 2021 14:59
The "fix" actually has not fixed anything (rear#781 was a different issue)
and reintroduces bug rear#465 (re-reported as rear#2611). The issue is that if
the directories are not empty, it is because they contain potentially
valuable data (old backups and backups from other machines), so removing
them in order to silence some error is the worst thing to do.

This reverts commit 4e9c2a1.
…-YUM files to copies of those files"

Files that did not differ from their originals and have not gained
significant updates afterwards are better kept as symlinks to reduce
duplication and maintenance effort.

This reverts commit 3d9cc24
except for usr/share/rear/restore/YUM/default/970_set_root_password.sh
Introduced as a copy in e68326f.

No functional change.
Replace identical script copies by symlinks.
$BUILD_DIR/outputfs/$RSYNC_PREFIX has never been used by rsync. Removing
it is even dangerous, because usually $RSYNC_PREFIX == $NETFS_PREFIX ==
$OUTPUT_PREFIX, so this would lead to a loss of data if the unmount
fails. It could even remove all the backups if $RSYNC_PREFIX is empty.
NETFS_PREFIX has not been used in the output stage since
a8fdc44. This was the last instance.
Checks whether a given URL scheme supports direct filesystem access when
mounted.
obdr is similar to tape, so it should be in the same code branch.

At this time mostly academic since OBDR support was most likely broken
by isofs changes in 4ef0f30 (ppc64le), 3658b79 (ppc64) and 41efc97
(i386).
Space-separate commands from the enclosing parens.
Use Error instead of StopIfError which is problematic for several
reasons (see the comments near its definition).

Error also shows last error messages from the failed commands.

Analogous to changes in PR rear#1877.
Since a8fdc44 the handling of
BACKUP_URL and OUTPUT_URL is separate and the code is not shared between
backup and output stages. Also, the code that handles the lockfile is
different anyway. This allows some simplification by removing the
indirection based on the current stage that originally allowed sharing
the code.

Making this code generic and shared between the stages might be a good
idea, but would have to be done a bit differently.
The PXE output (except for the legacy support) needs direct filesystem
access to OUTPUT_URL. Therefore complain if OUTPUT_URL can not be
mounted.
Instead of testing whether the path returned by output_path() is empty,
use the new dedicated scheme_supports_filesystem() function for that.
This is less error prone, because it is easy to ignore the empty string
returned from output_path() and accidentally create files in the root
directory.

Functional change: output schemes that do not support mounting (anything
"ftp-like" or "tape-like") will stop creating an empty directory
$BUILD_DIR/outputfs.
Determines whether one can upload files to a URL.

True for most schemes except for null and tape-based schemes (tape and
obdr).

Does not tell how exactly one is supposed to upload files to the URL:
it could be via a local mountpoint (mounted by mount_url) or by using a
file transfer utility for URLs that can not be mounted (usually lftp or
rsync). Use scheme_supports_filesystem() to distinguish between the two
cases.
Replaces a hardcoded list of schemes that do not support uploading files
and can be skipped. Also stop relying on output_path() returning an
empty string for paths that do not accept files and move the
output_path() call down to the place where we actually know that we will
use the result.

Functional change: if the OUTPUT method has produced files but we don't
know how to upload them, abort. This prevents the output from being
silently lost in case the selected scheme does not accept them. This
will affect the tape scheme. The only exception is null, because it
indicates that the user does not expect the files to be uploaded
anywhere.
${array[*]} instead of "${array[@]}" was indeed intended. Those uses are
inside bigger quoted strings and ${array[@]} would split them into
multiple strings.

Use ${array[*]} in other places where it was not originally intended,
but should have been.
"Wrong" here means schemes that do not support filesystem access. Since
this function returns a filesystem path that can be used to access the
URL destination directly, it is crucial that this is actually supported.

Returning an empty string for them is not satisfactory: it could lead to
caller putting its files under / instead of the intended location if the
result is not checked for emptiness.

Returning ${BUILD_DIR}/outputfs/${OUTPUT_PREFIX} even for unmountable
URLs (as done before) is also not satisfactory: caller could put its
files there expecting them to be safely at their destination, but if the
directory is not a mountpoint, they would get silently lost.

All callers have been modified to avoid calling this function if it
would abort.
After umount_url, this directory should not be there. Removing it is
even dangerous, because it could lead to a loss of data if the unmount
fails. It could even remove all the outputs if $OUTPUT_PREFIX happens to
be empty.
This function has multiple problems, see comments in code - better to
use explicit tests and Error.
There are many places where a temporary mount point is being removed, so
introduce a shared function for it, even if it is a tiny one-liner.
All callers of mount_url() create a temporary mount point for it, so we
can move the mount point creation there, together with all the exit task
logic, and the corresponding removal into umount_url().

Functional change: mount point does not get created and removed for URLs
that do not need to be mounted. This mount point would be unused and if
not, it would be a bug (nobody would save files that appear there, so
they would get lost).

Bug fix: call the filesystem-specific umount handling for davfs and
sshfs (it used to be ignored).
If scheme_supports_filesystem returns false, the URL is definitely not
mountable. Use this instead of keepeing a long list of unmountable URL
schemes in mount_url() and umount_url().

There are some URLs that allow filesystem accces even if we don't
mount and umount them (file: and sometimes iso: ). They need to be
handled as special case. Still, this reduces the amount of hardcoded
schemes quite a lot.
pcahyna added 4 commits June 18, 2021 13:58
This ensures that a filesystem-specific unmount command is invoked
instead of the generic `umount`, if necessary. It also makes
improvements like lazy unmount easy to consistently use in all places
that benefit from them.

Does not work for the var:// scheme, but this has been broken anyway
(by the "fix" for rear#170, see the discussion for commit
20359a9, and custom umount command has
never been used in exit tasks.)
Does it need fusermount -u for umount? Seems similar to the sshfs case,
but does not get the same special handling.
Temporary mountpoint for output is now handled internally in
mount_url()/umount_url().

Fixes a possible data loss: the exit task was using rm -Rf, which would
remove the output files for other machines, if the same tftpboot
directory is shared among multiple machines (and ReaR encourages this,
since it uses a unique prefix to identify backups from different
machines) and unmounting it fails. The shared implementation of exit
tasks now uses rmdir instead of rm -r.
Instead of special code in cleanup_build_area_and_end_program(), use the
shared helper to remove $BUILD_DIR/outputfs. This translates to using
rmdir always instead of using rm -Rf in case the filesystem is not
mounted. This is simpler and safer (disaster could arise if the
mountpoint command fails for any unexpected reason, for example).

It intentionally reverts PR rear#578 (commit
606d8ce). It is not clear why was
.lockfile present in the unmounted outputfs (this was the reason for the
change). According to my understanding of the code, this should not
happen. Even if this happens, it means that some part of the code is
using outputfs incorrectly (it should be used only if mounted, otherwise
data put there are lost) and it is better to be warned about the
problem rather than ignoring it by blindly removing the whole directory.
@pcahyna pcahyna force-pushed the fix-backup-removal-in-exit-task-updated branch from 7c66c68 to 0515eaf Compare June 18, 2021 11:59
@pcahyna
Copy link
Member Author

pcahyna commented Jun 18, 2021

rebased again to fix a typo in commit message

@pcahyna
Copy link
Member Author

pcahyna commented Jun 18, 2021

I thought that OBDR was HP-specific, but maybe it got implemented by other vendors as well.

@jsmeix
Copy link
Member

jsmeix commented Jun 18, 2021

If there are no objections right now
I would like to merge it in about half an hour.

@jsmeix
Copy link
Member

jsmeix commented Jun 18, 2021

When OBDR is/was HP-specific it explains why IBM documents don't mention it.

pcahyna added 9 commits June 18, 2021 14:12
When the build area is left for debugging purposes, we tell the user
that they should remove it later. Since the output filesystem with
valuable old backups or backups from other machines may be still mounted
at $BUILD_DIR/outputfs, tell the user to use rm -Rf --one-file-system
instead of just rm -Rf. This prevents data loss if the user follows the
intruction literally.

All supported distributions now support the --one-file-system option, so
there should be no problem with this.
umount_mountpoint_lazy() was introduced for this purpose earlier, so use
it also in cleanup_build_area_and_end_program().

Note: the cleanup in cleanup_build_area_and_end_program() duplicates the
exit tasks used by mount_url()/umount_url(). This has been the case for
a long time and should not be a real problem, althought it would be
probably safe to remove this cleanup from
cleanup_build_area_and_end_program() and leave only those in
mount_url()/umount_url().
Instead of hardcoding a list of output URL schemes that can be accessed
as a filesystem when copying result files to their destination, use the
scheme_supports_filesystem() function that tells us exactly this
information. This extends supported schemes to anything that can be
mounted by mount_url().
If the temporary mountpoint does not exist, we should be happy, since
there is nothing to do. Unfortunately rmdir (unlike rm -Rf) reports an
error in this case, so test for the directory existence before.
This used to be the case before. This case is encountered without any
local configuration, because by default BACKUP_URL is empty and
OUTPUT_URL is equal to BACKUP_URL. It might be better to use
OUTPUT_URL=null as the default value instead of empty, but for now
preserve the previous behavior.
Apparently it is not needed since rear#2416.
Functions that optionally perform lazy umount now require that the
optional argument is either empty or equal to "lazy". An arbitrary
nonempty string is not enough and is now considered a bug. This improves
readability. Suggestion by @jsmeix.
This reverts commit d850c40.

The script probably made sense when it was submitted, but in the
meantime generic handling of transfer of output files got implemented
( a8fdc44 ), which made it redundant when it was committed.
@pcahyna pcahyna force-pushed the fix-backup-removal-in-exit-task-updated branch from 0515eaf to 2b56f38 Compare June 18, 2021 12:13
@pcahyna
Copy link
Member Author

pcahyna commented Jun 18, 2021

I pushed again to fix another tiny commit message mistake, no functional difference.

@gdha
Copy link
Member

gdha commented Jun 18, 2021

@pcahyna Yes, OBDR is HPE specific - see the included PDF file.

I thought that OBDR was HP-specific, but maybe it got implemented by other vendors as well.

Compatibility_Matrices.pdf

That being said we never got a complaint it wasn't working (and I'm pretty sure it is broken in the meantime) - IMHO I'm in favor of removing it altogether. @dagwieers Do you mind if it was gone as you were the only one ever used by my knowledge?

@jsmeix
Copy link
Member

jsmeix commented Jun 18, 2021

@pcahyna
again many thanks for your huge bugfix and cleanup work!

I wish you and all ReaR contributors and users
a relaxed and recovering weekend!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ReaR configured with netfs backup can delete other/old folders on NFS at failure

4 participants