cleanUrl: "sequenza-usage"
description: "Learn how to use Sequenza."
Sequenza is a tool to analyze genomic sequencing data from paired normal-tumor samples, including cellularity and ploidy estimation; mutation and copy number (allele-specific and total copy number) detection, quantification and visualization.
https://ars.els-cdn.com/content/image/1-s2.0-S0923753419313237-mmc2.pdf
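The analysis runs in two stages: Python-based preprocessing with sequenza-utils, followed by R-based plotting. Two installations are therefore required, one for the Python package and one for the R package: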
mamba install -c bioconda sequenza-utils r-sequenza
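Because the workflow depends on both a Python command-line tool and R, it can help to confirm that both executables are reachable before starting. The following is a minimal sketch, not part of Sequenza itself: the helper name `missing_tools` is hypothetical, and it only checks that `sequenza-utils` and `Rscript` are on `PATH`, not that the R `sequenza` package loads.

```python
import shutil

# Executables this walkthrough relies on: the Python preprocessing CLI
# and the R script runner. (List chosen for illustration.)
REQUIRED_TOOLS = ["sequenza-utils", "Rscript"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("sequenza-utils and Rscript are both available.")
```

If either name is reported as missing, activating the environment where the `mamba install` command was run is the first thing to check.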
sequenza-utils gc_wiggle -w 50 --fasta hg38.fa -o hg38.gc50Base.wig.gz
Generate the seqz file:
sequenza-utils bam2seqz -n normal.bam -t tumor.bam --fasta hg38.fa \
    -gc hg38.gc50Base.wig.gz -o out.seqz.gz
Bin the seqz file:
sequenza-utils seqz_binning --seqz out.seqz.gz -w 50 -o out.small.seqz.gz
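When several samples have to be processed, the three preprocessing commands can be generated per sample instead of typed by hand. The sketch below only builds the command lines and prints them; it does not run anything. The function name `preprocessing_commands` and the `<sample>.seqz.gz` / `<sample>.small.seqz.gz` naming scheme are assumptions made for illustration, while the subcommands and flags are the ones used in the commands in this post.

```python
import shlex


def preprocessing_commands(sample, normal_bam, tumor_bam,
                           fasta="hg38.fa",
                           gc_wig="hg38.gc50Base.wig.gz",
                           window=50):
    """Build the three sequenza-utils command lines for one sample, in run order."""
    seqz = f"{sample}.seqz.gz"              # full-resolution seqz file
    small_seqz = f"{sample}.small.seqz.gz"  # binned seqz file passed on to R
    return [
        # 1. GC-content wiggle track from the reference FASTA
        ["sequenza-utils", "gc_wiggle", "-w", str(window),
         "--fasta", fasta, "-o", gc_wig],
        # 2. seqz file from the paired normal and tumor BAMs
        ["sequenza-utils", "bam2seqz", "-n", normal_bam, "-t", tumor_bam,
         "--fasta", fasta, "-gc", gc_wig, "-o", seqz],
        # 3. Binned seqz file to reduce size before the R step
        ["sequenza-utils", "seqz_binning", "--seqz", seqz,
         "-w", str(window), "-o", small_seqz],
    ]


if __name__ == "__main__":
    for cmd in preprocessing_commands("mysample", "normal.bam", "tumor.bam"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute the step. The GC wiggle track depends only on the reference, so in practice the first command needs to run once per reference rather than once per sample.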
Analysis in R (run_sequenza.R)
library(argparse)
library(sequenza)
parser = ArgumentParser()
parser$add_argument('-i', '--input', required=TRUE) # mysample.small.seqz.gz
parser$add_argument('-n', '--name', required=TRUE) # mysample
parser$add_argument('-o', '--outdir', required=TRUE) # result/mysample
args = parser$parse_args()
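# Read the binned seqz file, then normalize and segment the data
extracted = sequenza.extract(args$input)
# Estimate cellularity and ploidy
CP = sequenza.fit(extracted)
# Write the result tables and plots for this sample to the output directory
sequenza.results(
    sequenza.extract=extracted,
    cp.table=CP,
    sample.id=args$name,
    out.dir=args$outdir
)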
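The script takes its three arguments on the command line, so it is invoked through `Rscript`. As a sketch of how that call might be assembled for a given sample, the helper below builds the argument list. The name `rscript_command` is hypothetical, and the input and output paths simply follow the examples in the script's comments (`mysample.small.seqz.gz`, `result/mysample`).

```python
import shlex
from pathlib import PurePosixPath


def rscript_command(sample, outdir_root="result", script="run_sequenza.R"):
    """Build the Rscript command line that runs the analysis for one sample."""
    outdir = PurePosixPath(outdir_root) / sample
    return [
        "Rscript", script,
        "--input", f"{sample}.small.seqz.gz",  # binned seqz from preprocessing
        "--name", sample,                       # used as sample.id
        "--outdir", str(outdir),                # used as out.dir
    ]


if __name__ == "__main__":
    print(shlex.join(rscript_command("mysample")))
```

Printing the command first makes it easy to check the paths before handing the list to `subprocess.run(cmd, check=True)`.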