@lakesea
2022-03-08T01:45:18.000000Z
字数 3257
阅读 333
pipelines
In this pipeline, I will describe and introduce some based steps to call gene PAVs.
gff to bed and grep those regions with gene feature
module load bedopsgff2bed <MH63_chr_v2.gene.gff >MH63_chr_v2.gene.bed
Running Mosdepth
mosdepth -n -t 6 --thresholds 1 -b MH63_chr_v2.gene.bed B009_test B009.sorted.bam###help-n --no-per-base dont output per-base depth. skipping this output will speed execution-t --threads <threads> number of BAM decompression threads [default: 0]-b --by <bed|window> optional BED file or (integer) window-sizes.other help:-F --flag <FLAG> exclude reads with any of the bits in FLAG set [default: 1796]#Sam file flag: 1796#read unmapped (0x4)#not primary alignment (0x100)#read fails platform/vendor quality checks (0x200)#read is PCR or optical duplicate (0x400)-q --quantize <segments> write quantized output see docs for description.-Q --mapq <mapq> mapping quality threshold [default: 0]-T --thresholds <thresholds> for each interval in --by, write number of bases covered by atleast threshold bases. Specify multiple integer values separatedby ','.-h --help show help
Look at the results
zcat B009_test.regions.bed.gz |lessChr01 262169 265985 MH01g0015200 20.14Chr01 266366 266897 MH01g0015300 0.00Chr01 272359 275777 MH01g0015400 16.74Chr01 274349 276712 MH01g0015500 20.18Chr01 280060 281122 MH01g0015600 20.07Chr01 284353 284998 MH01g0015700 18.16Chr01 285481 286360 MH01g0015800 34.02Chr01 288457 324059 MH01g0015900 18.84
notes:
zcat IRIS_313-8392_gene_pav.regions.bed.gz |awk '{if ($5>= 1) print $6=1; else if ($5 <1) print $6=0;}'for i in `cat selected.lines.list`;do zcat ${i}.remark.bam_pav.thresholds.bed.gz |grep -v "#"|awk '{$6 = $5/($3 - $2)}1'| awk '{if ($6>= 0.6) print $7=1; else if ($6 <0.6) print $7=0;}' >> ${i}_pav_dec.txt;done
install
git clone --recurse-submodules https://github.com/fbreitwieser/bamcovcd bamcovmakemake test
Running
$ ./bamcov/bamcov -m B009.sorted.bamChr01 (41.19Mbp)> 88.72% │ ▄████▄▄██████ ██▄██ █▄▄████████████████████▄▄│ Number of reads: 8894694> 78.87% │███████████████ █ █████ ▄█████████████████████████│ (165808 filtered)> 69.01% │███████████████ ██████████████████████████████████│ Covered bases: 39.10Mbp> 59.15% │███████████████▄██████████████████████████████████│ Percent covered: 94.94%> 49.29% │██████████████████████████████████████████████████│ Average coverage: 17.7x> 39.43% │██████████████████████████████████████████████████│ Average baseQ: 36> 29.57% │██████████████████████████████████████████████████│ Average mapQ: 49.1> 19.72% │██████████████████████████████████████████████████│> 9.86% │██████████████████████████████████████████████████│ Histo bin width: 823.7Kbp> 0.00% │██████████████████████████████████████████████████│ Histo max bin: 98.582%1 8.24M 16.47M 24.71M 32.95M 41.19M$ ./bamcov/bamcov -mr Chr01:69650-72886 B009.sorted.bamChr01 (41.19Mbp)> 90.00% │ ███ ██ █ █ █ █ █ │ Number of reads: 34> 80.00% │ ████ ██ █ █ █ █ █ │> 70.00% │ █████ ██ ▄ █ █ █ █ █ │ Covered bases: 1.1Kbp> 60.00% │ █████ ██▄ █ █ █ █ █ █ │ Percent covered: 34.72%> 50.00% │ █████ ███ █▄ █ █ ▄█ █▄ █▄│ Average coverage: 0.819x> 40.00% │ █████ ███ ██ █ █ ██ ██ ██│ Average baseQ: 34.5> 30.00% │█ █████ ███ ██ █ █ ██ ██ ██│ Average mapQ: 3.12> 20.00% │█ █████ ▄███ ██ █ █ ██ ██▄██│> 10.00% │█ █████ ████ ██ ▄██ ██ ██ █████│ Histo bin width: 64bp> 0.00% │█ █████ ████ ██ ███ ███ ██▄▄█████│ Histo max bin: 100%69.7K 70.3K 70.9K 71.6K 72.2K 72.9K
bedtools genomecov -ibam B009.sorted.bam -bg > bedtools_genomcov.out###resultsChr01 99 100 3Chr01 100 104 17Chr01 104 106 18Chr01 106 107 19Chr01 107 108 20Chr01 108 109 21Chr01 109 110 22Chr01 110 112 23Chr01 112 120 24Chr01 120 124 25