@Steffengreiner Steffengreiner commented Jan 31, 2023

What was changed
This PR introduces a schema and an example JSON file for checking whether the minimal required files for a nanopore data registration are present in a provided dataset.
Additionally, it adapts the OxfordNanoporeExperiment generation so that it no longer fails when an unknown file or folder is found. Such entries are instead included as OptionalFile and OptionalFolder in the data structure, which allows for variations within the provided data structure.
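
To illustrate the fallback behaviour described above, here is a minimal Python sketch. It is not the code from this PR: the `Entry` class, the `classify` function, and the `KnownFile`/`KnownFolder` labels are hypothetical names chosen for the example. Only `OptionalFile`, `OptionalFolder`, and the folder and file names (taken from the Background information section) come from this description.

```python
from dataclasses import dataclass

# Names as listed under "Background information" in this PR description.
KNOWN_FOLDERS = {"FASTQ_FAIL", "FASTQ_PASS", "FAST5_FAIL", "FAST5_PASS"}
KNOWN_FILES = {"final_summary.txt", "report.md", "sequencing_summary.txt"}


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one element of the parsed data structure."""
    name: str
    kind: str  # "KnownFile", "KnownFolder", "OptionalFile" or "OptionalFolder"


def classify(name: str, is_folder: bool) -> Entry:
    """Map an entry to a known type, falling back to an optional type.

    Unknown names never raise; they are kept as OptionalFile/OptionalFolder.
    """
    if is_folder:
        kind = "KnownFolder" if name in KNOWN_FOLDERS else "OptionalFolder"
    else:
        kind = "KnownFile" if name in KNOWN_FILES else "OptionalFile"
    return Entry(name, kind)
```

The point of the sketch is that an unexpected entry is recorded rather than rejected, so a dataset with extra content can still be turned into an experiment object.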

Background information
As of now the minimum required folders are:

  • FASTQ_FAIL
  • FASTQ_PASS
  • FAST5_FAIL
  • FAST5_PASS

While the minimum required files are:

  • final_summary.txt (We extract metadata from this file)
  • report.md (We extract metadata from this file)
  • sequencing_summary.txt (Both Labs)
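
The minimal check implied by the two lists above could look roughly like the following Python sketch. It is an illustration only, not the schema added in this PR: `missing_entries` and `is_registrable` are hypothetical helpers, and the sketch assumes that folder and file names match the listed names exactly.

```python
from typing import Iterable, List

# Minimal required entries, as listed in this PR description.
REQUIRED_FOLDERS = ("FASTQ_FAIL", "FASTQ_PASS", "FAST5_FAIL", "FAST5_PASS")
REQUIRED_FILES = ("final_summary.txt", "report.md", "sequencing_summary.txt")


def missing_entries(folders: Iterable[str], files: Iterable[str]) -> List[str]:
    """Return the required folder and file names absent from the dataset."""
    present_folders = set(folders)
    present_files = set(files)
    missing = [name for name in REQUIRED_FOLDERS if name not in present_folders]
    missing += [name for name in REQUIRED_FILES if name not in present_files]
    return missing


def is_registrable(folders: Iterable[str], files: Iterable[str]) -> bool:
    """A dataset passes the minimal check when nothing required is missing."""
    return not missing_entries(folders, files)
```

Extra folders or files do not affect the result, which matches the intent of accepting variations as long as the minimal set is present.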

Open Questions
With the new constraints, all of the previously defined schemas and datafiles/Datafolders are now unnecessary and could be deleted: as soon as the minimal required files are provided, the OxfordNanoporeExperiment is created and can be parsed.
The only reason to keep multiple schemas would be if the distinction between the files in the OxfordNanoporeExperiment class is necessary (e.g. for accessing metadata within the files).
So far, however, this is only the case for the final_summary.txt file and the report.md file.

Steffengreiner added 2 commits January 30, 2023 17:25

@KochTobi KochTobi left a comment

Only some minor requests

@KochTobi KochTobi left a comment

LGTM 👍

@Steffengreiner Steffengreiner merged commit d0a192a into development Feb 13, 2023
@Steffengreiner Steffengreiner deleted the feature/dm-669-valid-minimal-nanopore-datastructure branch February 13, 2023 13:58
Steffengreiner added a commit that referenced this pull request Feb 13, 2023
Introduce minimal Schema for nanopore registration  (#350)

3 participants