Merged
Add new reference files - separate files for bacteria and archaea
Previously, a check looked for overlap between the ASVs in the input FASTA and those in the feature table, but duplicated sequence IDs in the input FASTA could still cause downstream errors. This fix adds a check that no sequence ID appears more than once in the input FASTA.
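For illustration, a duplicate-ID check of this kind could look roughly like the sketch below. It is not the code added in this PR: the function name `find_duplicate_fasta_ids` is hypothetical, and it assumes the sequence ID is the first whitespace-delimited token on each header line.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_fasta_ids(lines: Iterable[str]) -> List[str]:
    """Return sequence IDs that occur more than once, in first-seen order.

    Hypothetical helper: assumes the ID is the first whitespace-delimited
    token after '>' on each FASTA header line.
    """
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            seq_id = header.split()[0] if header else ""
            counts[seq_id] += 1
    # Counter keeps insertion order, so duplicates come out in first-seen order.
    return [seq_id for seq_id, n in counts.items() if n > 1]


# Small in-memory example; a real call would pass an open file handle instead.
example = [">ASV1", "ACGT", ">ASV2 some description", "GGCC", ">ASV1", "TTAA"]
duplicates = find_duplicate_fasta_ids(example)
if duplicates:
    print("Duplicated sequence IDs in input FASTA: " + ", ".join(duplicates))
```

A real pipeline would presumably stop with an error at this point rather than print a message.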
- Zipped default file that was previously unzipped
- Added metacyc reaction mapping pathways (modified from HUMAnN3)
- Added line to castor_hsp.R that ensures no issues running maximum parsimony method even if edge lengths of tree have zeroes
Some new scripts have been added:
- default_split.py: locations of new default files for when we're running bacteria/archaea separately
- split_domains.py: functions for choosing the best domain for each sequence based on which has the lowest NSTI
- pick_best_domain.py: wrapper for picking the best domain to use for each sequence when we're running bacteria/archaea separately. Note that this would be run between hsp.py with the 16S/marker gene file and running hsp.py with any other trait files
- combine_domains.py: wrapper for combining functional predictions from hsp.py for when we're running multiple domains. This would be run before the metagenome_pipeline step
- Functions in util.py have been added for steps like reading in and pruning the tree files
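The idea of choosing a domain per sequence by lowest NSTI can be sketched as follows. This is only an illustration under stated assumptions, not the implementation in split_domains.py: the function name `pick_lowest_nsti_domain`, the nested-dictionary input, and the tie-breaking rule (the first domain listed wins) are all hypothetical.

```python
from typing import Dict


def pick_lowest_nsti_domain(
    nsti_by_domain: Dict[str, Dict[str, float]]
) -> Dict[str, str]:
    """Map each sequence ID to the domain in which it has the lowest NSTI.

    Hypothetical sketch: input maps domain name -> {sequence ID: NSTI}.
    A sequence missing from one domain is assigned using the remaining ones.
    On ties, the domain listed first is kept (an assumption for this sketch).
    """
    best_domain: Dict[str, str] = {}
    best_nsti: Dict[str, float] = {}
    for domain, nsti_values in nsti_by_domain.items():
        for seq_id, nsti in nsti_values.items():
            if seq_id not in best_nsti or nsti < best_nsti[seq_id]:
                best_domain[seq_id] = domain
                best_nsti[seq_id] = nsti
    return best_domain


# Made-up NSTI values purely for demonstration.
nsti = {
    "bacteria": {"ASV1": 0.05, "ASV2": 0.90},
    "archaea": {"ASV1": 0.40, "ASV2": 0.10, "ASV3": 0.20},
}
assignments = pick_lowest_nsti_domain(nsti)
for seq_id, domain in sorted(assignments.items()):
    print(seq_id, domain)
```

Keeping the selection as a pure function over already-parsed NSTI values makes it easy to test separately from file reading and from the hsp.py calls that come before and after it.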
Update to add scripts for running both bacterial and archaeal predictions
Added requirement for ete3 to yaml file
Added check for when no sequences match with one of the domains
Note that this file still needs testing
Archaea reference files now work with SEPP
Added new -db flag to pathway_pipeline.py
This branch:
To see a full overview of the new database and the changes made, see the Wiki page.