Fix backup removal in exit task and cleanup handling of URL mountpoints #2625

pcahyna · 2021-06-04T18:08:37Z

Pull Request Details:

Type: Bug Fix / Cleanup
Impact: Critical
Reference to related issue (URL): ReaR configured with netfs backup can delete other/old folders on NFS at failure #2611 Fixes #781 Fix for removing directory in case that they are not empty #782 rear configured with netfs backup deletes other folders on NFS at failure #465
How was this pull request tested?
On CentOS 7 or CentOS 8:
- backup and restore with backup on ISO and ISO stored on OUTPUT_URL=sftp://
- backup to NFS with several reproducers for ReaR configured with netfs backup can delete other/old folders on NFS at failure #2611 :
  - overriding umount by a function that does nothing and inserting a script that kills ReaR before unmounting URLs, thus triggering the exit tasks
    - both in backup and output stages (bug was present only in the output stage)
  - inserting a script that blocks umount by spawning a long running process with its cwd inside the temporary mountpoint $BUILD_DIR/outputfs
    - both in backup and output stages (bug was present only in the output stage)
  - overriding umount by a function that does nothing and returns an error
    - only in the output stage
- backup to NFS without any bug reproducer
- complete backup and restore with rsync over ssh (BACKUP=RSYNC)
- backup to sshfs (BACKUP=NETFS BACKUP_URL=sshfs://...)
- mkrescue with OUTPUT=PXE and several reproducers
  PXE_TFTP_URL: nfs://..., PXE_CONFIG_URL: nfs://. Reproducers consist of overriding umount by a function that does nothing and returns an error
  - before 800_copy_to_tftp.sh
  - before 810_create_pxelinux_cfg.sh
- mkrescue with OUTPUT=PXE and no bug reproducer
  PXE_TFTP_URL: nfs://..., PXE_CONFIG_URL: nfs://

In the case of reproducers I chceked that other directories under the mountpoint do not get removed. In the case of no reproducers I checked that the proces completes successfully, and also that the build dir is correctly removed and that ReaR does not create and remove useless $BUILD_DIR/outputfs with URL schemes that do not need it (rsync://, sftp://).

Brief description of the changes in this pull request:

Cleanup of temporary mount point handling, particularly for output. Unification of mount point umount and cleanup - move to the mount_url() and umount_url() functions, callers do not need to worry about it, resulting in huge deduplication (demultiplication), this concerns both the normal path and the error path (exit tasks). Replaced the various rm -rf of the mountpoint by rmdir (this fixes #2611) and added lazy umount in case normal umount does not succeed. If build dir is kept, propose a safe way to remove it to the user (rm -Rf --one-file-system instead of just rm -Rf where the user would risk to remove everything in their mounted filesystem if still mounted.)

Stop creating the empty mountpoint ($BUILD_DIR/outputfs) for schemes that do not need to mount anything (rsync:// or anything "ftp-like".)

Fixes also some other bugs noted in the process: filesystem-specific umount command not called (20359a9#r51319634) , unknown schemes considered invalid (noted in discussion under #932).

Identical scripts under DUPLICITY and YUM were replaced by symlinks to increase code sharing.

Reverts #782 (the change that introduced bug #2611) and #578 - it is not clear how .lockfile can exist in the unmounted filesystem, and if it does, it is a bug.

The "fix" actually has not fixed anything (rear#781 was a different issue) and reintroduces bug rear#465 (re-reported as rear#2611). The issue is that if the directories are not empty, it is because they contain potentially valuable data (old backups and backups from other machines), so removing them in order to silence some error is the worst thing to do. This reverts commit 4e9c2a1.

…-YUM files to copies of those files" Files that did not differ from their originals and have not gained significant updates afterwards are better kept as symlinks to reduce duplication and maintenance effort. This reverts commit 3d9cc24 except for usr/share/rear/restore/YUM/default/970_set_root_password.sh

No functional change.

Introduced as a copy in e68326f. No functional change.

Replace identical script copies by symlinks.

$BUILD_DIR/outputfs/$RSYNC_PREFIX has never been used by rsync. Removing it is even dangerous, because usually $RSYNC_PREFIX == $NETFS_PREFIX == $OUTPUT_PREFIX, so this would lead to a loss of data if the unmount fails. It could even remove all the backups if $RSYNC_PREFIX is empty.

NETFS_PREFIX has not been used in the output stage since a8fdc44. This was the last instance.

Use the function keyword to declare functions. https://github.com/rear/rear/wiki/Coding-Style#functions

Checks whether a given URL scheme supports direct filesystem access when mounted.

obdr is similar to tape, so it should be in the same code branch. At this time mostly academic since OBDR support was most likely broken by isofs changes in 4ef0f30 (ppc64le), 3658b79 (ppc64) and 41efc97 (i386).

Space-separate commands from the enclosing parens.

Use Error instead of StopIfError which is problematic for several reasons (see the comments near its definition). Error also shows last error messages from the failed commands. Analogous to changes in PR rear#1877.

Since a8fdc44 the handling of BACKUP_URL and OUTPUT_URL is separate and the code is not shared between backup and output stages. Also, the code that handles the lockfile is different anyway. This allows some simplification by removing the indirection based on the current stage that originally allowed sharing the code. Making this code generic and shared between the stages might be a good idea, but would have to be done a bit differently.

The PXE output (except for the legacy support) needs direct filesystem access to OUTPUT_URL. Therefore complain if OUTPUT_URL can not be mounted.

Instead of testing whether the path returned by output_path() is empty, use the new dedicated scheme_supports_filesystem() function for that. This is less error prone, because it is easy to ignore the empty string returned from output_path() and accidentally create files in the root directory. Functional change: output schemes that do not support mounting (anything "ftp-like" or "tape-like") will stop creating an empty directory $BUILD_DIR/outputfs.

Determines whether one can upload files to a URL. True for most schemes except for null and tape-based schemes (tape and obdr). Does not tell how exactly one is supposed to upload files to the URL: it could be via a local mountpoint (mounted by mount_url) or by using a file transfer utility for URLs that can not be mounted (usually lftp or rsync). Use scheme_supports_filesystem() to distinguish between the two cases.

Replaces a hardcoded list of schemes that do not support uploading files and can be skipped. Also stop relying on output_path() returning an empty string for paths that do not accept files and move the output_path() call down to the place where we actually know that we will use the result. Functional change: if the OUTPUT method has produced files but we don't know how to upload them, abort. This prevents the output from being silently lost in case the selected scheme does not accept them. This will affect the tape scheme. The only exception is null, because it indicates that the user does not expect the files to be uploaded anywhere.

${array[*]} instead of "${array[@]}" was indeed intended. Those uses are inside bigger quoted strings and ${array[@]} would split them into multiple strings. Use ${array[*]} in other places where it was not originally intended, but should have been.

"Wrong" here means schemes that do not support filesystem access. Since this function returns a filesystem path that can be used to access the URL destination directly, it is crucial that this is actually supported. Returning an empty string for them is not satisfactory: it could lead to caller putting its files under / instead of the intended location if the result is not checked for emptiness. Returning ${BUILD_DIR}/outputfs/${OUTPUT_PREFIX} even for unmountable URLs (as done before) is also not satisfactory: caller could put its files there expecting them to be safely at their destination, but if the directory is not a mountpoint, they would get silently lost. All callers have been modified to avoid calling this function if it would abort.

After umount_url, this directory should not be there. Removing it is even dangerous, because it could lead to a loss of data if the unmount fails. It could even remove all the outputs if $OUTPUT_PREFIX happens to be empty.

This function has multiple problems, see comments in code - better to use explicit tests and Error.

There are many places where a temporary mount point is being removed, so introduce a shared function for it, even if it is a tiny one-liner.

All callers of mount_url() create a temporary mount point for it, so we can move the mount point creation there, together with all the exit task logic, and the corresponding removal into umount_url(). Functional change: mount point does not get created and removed for URLs that do not need to be mounted. This mount point would be unused and if not, it would be a bug (nobody would save files that appear there, so they would get lost). Bug fix: call the filesystem-specific umount handling for davfs and sshfs (it used to be ignored).

If scheme_supports_filesystem returns false, the URL is definitely not mountable. Use this instead of keepeing a long list of unmountable URL schemes in mount_url() and umount_url(). There are some URLs that allow filesystem accces even if we don't mount and umount them (file: and sometimes iso: ). They need to be handled as special case. Still, this reduces the amount of hardcoded schemes quite a lot.

This ensures that a filesystem-specific unmount command is invoked instead of the generic `umount`, if necessary. It also makes improvements like lazy unmount easy to consistently use in all places that benefit from them. Does not work for the var:// scheme, but this has been broken anyway (by the "fix" for rear#170, see the discussion for commit 20359a9, and custom umount command has never been used in exit tasks.)

Does it need fusermount -u for umount? Seems similar to the sshfs case, but does not get the same special handling.

Temporary mountpoint for output is now handled internally in mount_url()/umount_url(). Fixes a possible data loss: the exit task was using rm -Rf, which would remove the output files for other machines, if the same tftpboot directory is shared among multiple machines (and ReaR encourages this, since it uses a unique prefix to identify backups from different machines) and unmounting it fails. The shared implementation of exit tasks now uses rmdir instead of rm -r.

Instead of special code in cleanup_build_area_and_end_program(), use the shared helper to remove $BUILD_DIR/outputfs. This translates to using rmdir always instead of using rm -Rf in case the filesystem is not mounted. This is simpler and safer (disaster could arise if the mountpoint command fails for any unexpected reason, for example). It intentionally reverts PR rear#578 (commit 606d8ce). It is not clear why was .lockfile present in the unmounted outputfs (this was the reason for the change). According to my understanding of the code, this should not happen. Even if this happens, it means that some part of the code is using outputfs incorrectly (it should be used only if mounted, otherwise data put there are lost) and it is better to be warned about the problem rather than ignoring it by blindly removing the whole directory.

pcahyna · 2021-06-18T11:59:43Z

rebased again to fix a typo in commit message

pcahyna · 2021-06-18T12:00:53Z

I thought that OBDR was HP-specific, but maybe it got implemented by other vendors as well.

jsmeix · 2021-06-18T12:03:50Z

If there are no objections right now
I would like to merge it in about half an hour.

jsmeix · 2021-06-18T12:05:03Z

When OBDR is/was HP-specific it explains why IBM documents don't mention it.

When the build area is left for debugging purposes, we tell the user that they should remove it later. Since the output filesystem with valuable old backups or backups from other machines may be still mounted at $BUILD_DIR/outputfs, tell the user to use rm -Rf --one-file-system instead of just rm -Rf. This prevents data loss if the user follows the intruction literally. All supported distributions now support the --one-file-system option, so there should be no problem with this.

umount_mountpoint_lazy() was introduced for this purpose earlier, so use it also in cleanup_build_area_and_end_program(). Note: the cleanup in cleanup_build_area_and_end_program() duplicates the exit tasks used by mount_url()/umount_url(). This has been the case for a long time and should not be a real problem, althought it would be probably safe to remove this cleanup from cleanup_build_area_and_end_program() and leave only those in mount_url()/umount_url().

Instead of hardcoding a list of output URL schemes that can be accessed as a filesystem when copying result files to their destination, use the scheme_supports_filesystem() function that tells us exactly this information. This extends supported schemes to anything that can be mounted by mount_url().

If the temporary mountpoint does not exist, we should be happy, since there is nothing to do. Unfortunately rmdir (unlike rm -Rf) reports an error in this case, so test for the directory existence before.

This used to be the case before. This case is encountered without any local configuration, because by default BACKUP_URL is empty and OUTPUT_URL is equal to BACKUP_URL. It might be better to use OUTPUT_URL=null as the default value instead of empty, but for now preserve the previous behavior.

Apparently it is not needed since rear#2416.

@jsmeix

Functions that optionally perform lazy umount now require that the optional argument is either empty or equal to "lazy". An arbitrary nonempty string is not enough and is now considered a bug. This improves readability. Suggestion by @jsmeix.

This reverts commit d850c40. The script probably made sense when it was submitted, but in the meantime generic handling of transfer of output files got implemented ( a8fdc44 ), which made it redundant when it was committed.

Fixes mistaken commit 27de4cd in PR rear#2608

pcahyna · 2021-06-18T12:13:37Z

I pushed again to fix another tiny commit message mistake, no functional difference.

gdha · 2021-06-18T12:27:20Z

@pcahyna Yes, OBDR is HPE specific - see the included PDF file.

I thought that OBDR was HP-specific, but maybe it got implemented by other vendors as well.

Compatibility_Matrices.pdf

That being said we never got a complaint it wasn't working (and I'm pretty sure it is broken in the meantime) - IMHO I'm in favor of removing it altogether. @dagwieers Do you mind if it was gone as you were the only one ever used by my knowledge?

jsmeix · 2021-06-18T13:05:48Z

@pcahyna
again many thanks for your huge bugfix and cleanup work!

I wish you and all ReaR contributors and users
a relaxed and recovering weekend!

pcahyna added 29 commits June 4, 2021 14:59

Change YUM symlinks to more appropriate targets

eeed37d

No functional change.

Convert a YUM script into a symlink

ea13012

Introduced as a copy in e68326f. No functional change.

Reduce duplication in DUPLICITY

696a4cb

Replace identical script copies by symlinks.

Replace NETFS_PREFIX by OUTPUT_PREFIX in output

4b39c5c

NETFS_PREFIX has not been used in the output stage since a8fdc44. This was the last instance.

Apply Coding-Style to function declarations

b9364e8

Use the function keyword to declare functions. https://github.com/rear/rear/wiki/Coding-Style#functions

Add the scheme_supports_filesystem helper function

00d2d3e

Checks whether a given URL scheme supports direct filesystem access when mounted.

Add obdr as output scheme w/o filesystem access

d168728

obdr is similar to tape, so it should be in the same code branch. At this time mostly academic since OBDR support was most likely broken by isofs changes in 4ef0f30 (ppc64le), 3658b79 (ppc64) and 41efc97 (i386).

Apply Coding-Style to command substitutions

a6613e5

Space-separate commands from the enclosing parens.

Fix typo in comment

f99851d

Show last error messages using Error

458c2b5

Use Error instead of StopIfError which is problematic for several reasons (see the comments near its definition). Error also shows last error messages from the failed commands. Analogous to changes in PR rear#1877.

Update file name in comment

fa6a1f9

Remove superfluous declaration of unused variable

afccd1c

Protect against unsupported schemes in PXE output

6ce72b2

The PXE output (except for the legacy support) needs direct filesystem access to OUTPUT_URL. Therefore complain if OUTPUT_URL can not be mounted.

Proper test for empty RESULT_FILES array

07ece1f

Remove removal of outputfs/$OUTPUT_PREFIX

b5da190

After umount_url, this directory should not be there. Removing it is even dangerous, because it could lead to a loss of data if the unmount fails. It could even remove all the outputs if $OUTPUT_PREFIX happens to be empty.

Replace one instance of StopIfError

0fc7dcc

This function has multiple problems, see comments in code - better to use explicit tests and Error.

Untabify umount_url and fix misleading indent

1589225

Introduce function remove_temporary_mountpoint()

1efcf70

There are many places where a temporary mount point is being removed, so introduce a shared function for it, even if it is a tiny one-liner.

pcahyna mentioned this pull request Jun 4, 2021

error out instead of bugerror for not implemented URI schemes #932

Merged

pcahyna added 4 commits June 18, 2021 13:58

Add a comment about ftpfs

0af0d4f

Does it need fusermount -u for umount? Seems similar to the sshfs case, but does not get the same special handling.

pcahyna force-pushed the fix-backup-removal-in-exit-task-updated branch from 7c66c68 to 0515eaf Compare June 18, 2021 11:59

pcahyna added 9 commits June 18, 2021 14:12

Remove temporary mountpoint only if it exists

1e71249

If the temporary mountpoint does not exist, we should be happy, since there is nothing to do. Unfortunately rmdir (unlike rm -Rf) reports an error in this case, so test for the directory existence before.

Don't do anything special with stdout

3f75048

Apparently it is not needed since rear#2416.

Remove executable bit from a script

2b56f38

Fixes mistaken commit 27de4cd in PR rear#2608

pcahyna force-pushed the fix-backup-removal-in-exit-task-updated branch from 0515eaf to 2b56f38 Compare June 18, 2021 12:13

jsmeix merged commit 98a2240 into rear:master Jun 18, 2021

jsmeix mentioned this pull request Jun 18, 2021

ReaR configured with netfs backup can delete other/old folders on NFS at failure #2611

Closed

jsmeix added the fixed / solved / done label Jun 18, 2021

jsmeix mentioned this pull request Jun 22, 2021

Don't deprecate OBDR because it still works and is in use #2637

Closed

jsmeix mentioned this pull request Aug 31, 2021

OBDR doesnt work on DAT320 #35

Closed

jsmeix mentioned this pull request Jan 10, 2023

ReaR 2.7 on SLES15SP2: inexplicable regular umount failure #2908

Closed

jsmeix mentioned this pull request Feb 23, 2023

Deprecate MBR/BIOS boot in favour of UEFI boot? #2944

Closed

jsmeix mentioned this pull request Apr 9, 2025

Extend the umount_mountpoint function to try normal, forced and lazy umounts #3447

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix backup removal in exit task and cleanup handling of URL mountpoints #2625

Fix backup removal in exit task and cleanup handling of URL mountpoints #2625

Uh oh!

pcahyna commented Jun 4, 2021 •

edited

Loading

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

jsmeix commented Jun 18, 2021

Uh oh!

jsmeix commented Jun 18, 2021

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

gdha commented Jun 18, 2021 •

edited

Loading

Uh oh!

jsmeix commented Jun 18, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Fix backup removal in exit task and cleanup handling of URL mountpoints #2625

Fix backup removal in exit task and cleanup handling of URL mountpoints #2625

Uh oh!

Conversation

pcahyna commented Jun 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Details:

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

jsmeix commented Jun 18, 2021

Uh oh!

jsmeix commented Jun 18, 2021

Uh oh!

pcahyna commented Jun 18, 2021

Uh oh!

gdha commented Jun 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jsmeix commented Jun 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pcahyna commented Jun 4, 2021 •

edited

Loading

gdha commented Jun 18, 2021 •

edited

Loading

jsmeix commented Jun 18, 2021 •

edited

Loading