cleanUrl: "sequenza-usage"
description: "Learn how to use Sequenza."

Sequenza

Sequenza is a tool to analyze genomic sequencing data from paired normal-tumor samples, including cellularity and ploidy estimation; mutation and copy number (allele-specific and total copy number) detection, quantification and visualization.

Documentation

Sequenza User Guide

Paper

https://ars.els-cdn.com/content/image/1-s2.0-S0923753419313237-mmc2.pdf

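Installation

The analysis runs in two stages: preprocessing with the Python-based sequenza-utils, followed by analysis and plotting with the R-based sequenza package. Both the Python tool and the R package therefore need to be installed.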

mamba install -c bioconda sequenza-utils r-sequenza

Quickstart

  1. Process a FASTA file to produce a GC Wiggle track file.
sequenza-utils gc_wiggle -w 50 --fasta hg38.fa -o hg38.gc50Base.wig.gz
  2. Process BAM and Wiggle files to produce a seqz file.
sequenza-utils bam2seqz -n normal.bam -t tumor.bam --fasta hg38.fa \
	-gc hg38.gc50Base.wig.gz -o out.seqz.gz
  3. Post-process by binning the original seqz file.
sequenza-utils seqz_binning --seqz out.seqz.gz -w 50 -o out.small.seqz.gz
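
The three steps above can be chained per sample. The following is a minimal sketch, not part of sequenza-utils: the function names `run` and `seqz_preprocess`, the `DRY_RUN` variable, and the `<sample>.seqz.gz` / `<sample>.small.seqz.gz` naming are assumptions made for illustration. Only the `sequenza-utils` subcommands and flags already shown above are used, with the same window (50 by default) for the GC track and the binning.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: seqz_preprocess <sample> <normal.bam> <tumor.bam> <reference.fa> [window]
# The body runs in a subshell so its variables do not leak into the caller.
seqz_preprocess() (
    sample=$1 normal=$2 tumor=$3 fasta=$4 window=${5:-50}
    # e.g. hg38.fa -> hg38.gc50Base.wig.gz
    gc="${fasta%.fa}.gc${window}Base.wig.gz"

    # 1. GC wiggle track: depends only on the reference, so reuse it if present.
    if [ ! -s "$gc" ]; then
        run sequenza-utils gc_wiggle -w "$window" --fasta "$fasta" -o "$gc" || exit 1
    fi

    # 2. Build the seqz file, then 3. bin it; stop if step 2 fails.
    run sequenza-utils bam2seqz -n "$normal" -t "$tumor" --fasta "$fasta" \
        -gc "$gc" -o "${sample}.seqz.gz" &&
    run sequenza-utils seqz_binning --seqz "${sample}.seqz.gz" -w "$window" \
        -o "${sample}.small.seqz.gz"
)
```

Setting `DRY_RUN=1` before calling `seqz_preprocess mysample normal.bam tumor.bam hg38.fa` prints the commands without running them, which is a cheap way to check file names before starting a long job.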

Analysis in R (run_sequenza.R)

library(argparse)
library(sequenza)

parser = ArgumentParser()
parser$add_argument('-i', '--input', required=TRUE) # mysample.small.seqz.gz
parser$add_argument('-n', '--name', required=TRUE) # mysample
parser$add_argument('-o', '--outdir', required=TRUE) # result/mysample
args = parser$parse_args()

extracted = sequenza.extract(args$input)
CP = sequenza.fit(extracted)

sequenza.results(
	sequenza.extract=extracted,
	cp.table=CP,
	sample.id=args$name,
	out.dir=args$outdir
)
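
The script takes its input, sample name, and output directory from the command line, so it can be run once per binned seqz file. A minimal sketch, assuming the script is saved as `run_sequenza.R` in the working directory and that inputs are named `<sample>.small.seqz.gz`; the function name `run_all_samples` and the `RSCRIPT` override are hypothetical helpers, not part of Sequenza.

```shell
# Usage: run_all_samples <dir-with-seqz-files> <output-root>
# Set RSCRIPT=echo to preview the commands instead of running R.
run_all_samples() (
    indir=$1 outroot=$2
    for seqz in "$indir"/*.small.seqz.gz; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$seqz" ] || continue
        # Derive the sample name from the file name.
        name=$(basename "$seqz" .small.seqz.gz)
        # Create the per-sample output directory up front.
        mkdir -p "$outroot/$name"
        ${RSCRIPT:-Rscript} run_sequenza.R -i "$seqz" -n "$name" -o "$outroot/$name"
    done
)
```

For a single sample the equivalent call is `Rscript run_sequenza.R -i mysample.small.seqz.gz -n mysample -o result/mysample`, matching the comments in the script.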