AutoHiC currently has 3d-dna built into its pipeline. If you use YaHS, SALSA, Pin_hic, or another scaffolder, you can extend AutoHiC by following the steps below. This workflow is still being tested, so you may run into problems. If you do, please open an issue or contact me: [email protected]
Since some other dependencies are needed during use, we recommend using conda to prepare the environment.
conda create -n morehic -c bioconda python=3.6 matlock samtools -y
conda activate morehic
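Before moving on, it can help to confirm that the tools installed above are visible in the activated environment. The following is a minimal sketch; `check_tools` is a hypothetical helper name and is not part of AutoHiC.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Returns 0 if every command is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

With the morehic environment active, it could be called as `check_tools python samtools matlock`.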
- Download the conversion script
git clone git@github.com:phasegenomics/juicebox_scripts.git
If you cannot clone it, you can download it from the links below. It is in the folder other_tools, under the filename juicebox_scripts-master.zip.
| Google Drive (recommended) | Baidu Netdisk (百度网盘) | Quark (夸克) |
|---|---|---|
| Pre-trained model | Pre-trained model | Pre-trained model |
Since AutoHiC requires .hic and .assembly files, we have to generate them first. This process requires the use of genome files and bam files. These two files come from the custom assembly software you use.
First, generate an .assembly file from the genome file, using an AGP file as an intermediate.
# FASTA to AGP
python3 juicebox_scripts/juicebox_scripts/makeAgpFromFasta.py test.fasta out.agp
# AGP to assembly
python3 juicebox_scripts/juicebox_scripts/agp2assembly.py out.agp out.assembly
The path to juicebox_scripts must be adjusted to match your own setup.
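One way to avoid editing the path by hand in each command is to keep the location of the cloned repository in a variable. This is only a sketch: `JUICEBOX_SCRIPTS` and `fasta_to_assembly` are hypothetical names, and the script locations are simply those used in the two conversion commands above.

```shell
# Assumption: JUICEBOX_SCRIPTS points at the cloned juicebox_scripts repository.
JUICEBOX_SCRIPTS="${JUICEBOX_SCRIPTS:-./juicebox_scripts}"

# Hypothetical wrapper: FASTA -> AGP -> .assembly, with a path check first.
# Usage: fasta_to_assembly <genome.fasta> <output_prefix>
fasta_to_assembly() {
    fasta="$1"
    prefix="$2"
    script_dir="$JUICEBOX_SCRIPTS/juicebox_scripts"
    if [ ! -f "$script_dir/makeAgpFromFasta.py" ]; then
        echo "makeAgpFromFasta.py not found under $script_dir" >&2
        return 1
    fi
    python3 "$script_dir/makeAgpFromFasta.py" "$fasta" "$prefix.agp" &&
        python3 "$script_dir/agp2assembly.py" "$prefix.agp" "$prefix.assembly"
}
```

For example, `fasta_to_assembly test.fasta out` would write out.agp and out.assembly, or stop early with a message if the path is wrong.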
Use the bam file to generate the corresponding .hic file. This step requires 3d-dna, which can be obtained from the link above (soft download).
- If you have multiple bam files, you can merge them with the following command:
# merge bam
samtools merge merged.bam input1.bam input2.bam input3.bam
- Generate the .hic file
# this step sometimes crashes on memory
matlock bam2 juicer out.bam out.links.txt
sort -k2,2 -k6,6 out.links.txt > out.sorted.links.txt
# creates .hic file
bash 3d-dna/visualize/run-assembly-visualizer.sh out.assembly out.sorted.links.txt
# The path to 3d-dna must be adjusted to match your own setup.
The above steps make certain assumptions about the contents of the bam file. If an error is reported while generating the out.links.txt file, you can preprocess the bam file with the following commands:
# this BAM file should represent Hi-C reads mapped against starting contigs!
samtools view -h in.bam | sed '/^[^@]/s/^\(.*\)\/[12]\t/\1\t/' | samtools view -Sb -o out.bam -
samtools sort -@ 40 -n out.bam -o out.sorted.bam
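To see what the sed expression in this pipeline does, it can be run on a couple of hand-written SAM lines. On every non-header line it removes a trailing /1 or /2 from the read name, so both mates of a pair end up with the same name. The sketch below assumes GNU sed (which understands \t in patterns); `strip_mate_suffix` is a hypothetical helper name and the sample lines are made up.

```shell
# Same sed expression as above, wrapped in a hypothetical helper
# that reads SAM text on stdin and writes the result to stdout.
strip_mate_suffix() {
    sed '/^[^@]/s/^\(.*\)\/[12]\t/\1\t/'
}

# Made-up input: one header line and one read named "readA/1".
# The header line (starting with @) is skipped by the sed address.
printf '@SQ\tSN:ctg1\tLN:1000\nreadA/1\t65\tctg1\t100\n' | strip_mate_suffix
```

If the read name in the second line is not shortened to readA, the sed on your system probably does not treat \t as a tab.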
If you encounter the following error, your bam file does not match the newly assembled genome. You need to re-align the reads to the new genome and use the updated bam file.
temp.scaffolds_FINAL.asm_mnd.txt does not exist or does not contain any reads.
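A quick way to check for this kind of mismatch before running the pipeline is to compare the sequence names in the genome file with the reference names in the bam header. The following is a rough sketch under that assumption; the helper names are hypothetical and it only compares names, not sequence lengths or content.

```shell
# Hypothetical helpers for comparing sequence names; not part of AutoHiC.

# Print sorted sequence names from a FASTA file
# (the text after '>' up to the first whitespace).
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Print sorted reference names from SAM header text on stdin
# (the SN: field of each @SQ line).
sam_header_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort -u
}

# Print names that appear in the SAM header (stdin) but not in the FASTA ($1).
names_only_in_bam() {
    fasta_list=$(mktemp) && bam_list=$(mktemp) || return 1
    fasta_names "$1" > "$fasta_list"
    sam_header_names > "$bam_list"
    comm -13 "$fasta_list" "$bam_list"
    rm -f "$fasta_list" "$bam_list"
}
```

For example, `samtools view -H in.bam | names_only_in_bam test.fasta` prints nothing when every reference name in the bam header is also present in the genome file.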
Since the morehic environment created above is incompatible with AutoHiC, you have to create a separate environment by following the AutoHiC documentation.
# clone AutoHiC
git clone https://github.com/Jwindler/AutoHiC.git
# cd AutoHiC
cd AutoHiC
# create AutoHiC env
conda env create -f autohic.yaml
# activate AutoHiC
conda activate autohic
# configure the environment
cd ./src/models/swin
# install dependencies
pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple/
Now you can use onehic.py to adjust the genome based on the out.assembly and out.hic files generated above.
# Enter the AutoHiC directory.
cd /home/ubuntu/AutoHic
# run onehic
python3.9 onehic.py -hic out.hic -asy out.assembly -autohic /home/ubuntu/AutoHic -p pretrained.pth -out ./
# activate env
conda activate morehic
# get new fasta
python juicebox_assembly_converter.py -a adjusted.assembly -f genome.fasta
Since this process is currently in testing, if you have any questions, please feel free to contact me ([email protected]) and I will be happy to help.
If you used AutoHiC in your research, please cite us:
AutoHiC: a deep-learning method for automatic and accurate chromosome-level genome assembly
Zijie Jiang, Zhixiang Peng, Yongjiang Luo, Lingzi Bie, Yi Wang
bioRxiv 2023.08.27.555031; doi: https://doi.org/10.1101/2023.08.27.555031