Skip to content

Closes #936 Check directories for reruns/crashed imports#937

Merged
MichaelStritt merged 7 commits intodevelopfrom
import-#936_Rerun
Nov 26, 2021
Merged

Closes #936 Check directories for reruns/crashed imports#937
MichaelStritt merged 7 commits intodevelopfrom
import-#936_Rerun

Conversation

@MichaelStritt
Copy link
Contributor

@MichaelStritt MichaelStritt commented Nov 20, 2021

Linked issue

Closes #936

@MichaelStritt MichaelStritt added the import Related to data import module label Nov 20, 2021
@MichaelStritt MichaelStritt linked an issue Nov 20, 2021 that may be closed by this pull request
@MichaelStritt MichaelStritt self-assigned this Nov 20, 2021
@MichaelStritt
Copy link
Contributor Author

@jan-petr & @HenkMutsaerts: I don't really know how this is supposed to work in some cases. For NII2BIDS it is somewhat easy, because we can just check if the rawdata ASL or T1 folder already exists and delete it for the reruns or when the pipeline crashed. For BIDS2Legacy this is problematic, because we do the import on subject level and as I said before, as long as we don't have subject directories in legacy, this is kind of impossible. For DCM2NII I know there was a previous option with bOverwrite and so on. Maybe we can use that. Other than that I don't know how I should check for existing directories if one directory can be the top of the other one (T1 on top of ASL e.g.) or if we don't exactly know all of the output scans of DCM2NII beforehand. Nevertheless, I think the implemented code is an improvement for NII2BIDS and there are some simplifications for BIDS2Legacy.

Copy link
Member

@HenkMutsaerts HenkMutsaerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the rerun, I would:

  • base rerunning and overwriting or skipping per subject only on the existence of the lock files, identical to how this is managed in the processing modules. So if a lock/status file doesn't exist, this means that the current subject wasn't correctly run, and the previous rawdata should be deleted
  • ensure that the lock/status files are only created after successful import of a single subject
  • always overwrite existing rawdata or derivatives for the current subject (since overwriting/skipping is managed by the lock/status files)

So instead of checking folders, files, what if they exist partly, etc, etc, you only rely on the lock files (unless you create a bCompletelyRerunImport parameter of course, in that case you delete all lock files). Makes the code/program also much more transparent.

@MichaelStritt
Copy link
Contributor Author

So instead of checking folders, files, what if they exist partly, etc, etc, you only rely on the lock files

And that doesn't work on your side? If one of the import steps crashes for me, then the lock file is not created and on a rerun it reruns the sub-module. I think we should make it cleaner and remove some files/folders for the import reruns, because I'm not sure if every xASL_Move command or the used tools work with overwrite, but other than that this part should be fine. Please actually describe the defect/failure behavior.

Copy link
Contributor

@jan-petr jan-petr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments only.

@MichaelStritt
Copy link
Contributor Author

Still think we should implement a sub/ses/run structure in the xASL derivatives and optimize the temp structure, so that reruns can be done in an easier way by removing pre-existing data from pipeline crashes etc.

@MichaelStritt MichaelStritt changed the title #936 Check directories for reruns/crashed imports Closes #936 Check directories for reruns/crashed imports Nov 26, 2021
@HenkMutsaerts

This comment has been minimized.

@HenkMutsaerts
Copy link
Member

Still think we should implement a sub/ses/run structure in the xASL derivatives and optimize the temp structure, so that reruns can be done in an easier way by removing pre-existing data from pipeline crashes etc.

Yep good point for version 2.0

Copy link
Member

@HenkMutsaerts HenkMutsaerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions, hope this helps you also understanding this from a user perspective, which can be very useful in software development!

@MichaelStritt MichaelStritt merged commit 54d1583 into develop Nov 26, 2021
@MichaelStritt MichaelStritt deleted the import-#936_Rerun branch November 26, 2021 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

import Related to data import module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Import module incorrectly says "has been run before"

3 participants