R/prepare_wgs_cell_line.R
This function performs part of the Battenberg WGS pipeline: counting alleles, generating BAF and logR, reconstructing normal-pair allele counts for the cell line, and performing GC content correction.
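As a rough illustration of the two quantities named above, the sketch below computes BAF (B-allele frequency) and logR from allele counts using the generic definitions: BAF is the fraction of reads supporting the B allele, and logR is the log2 ratio of tumour to normal depth, normalised by its mean. All counts are made up, the variable names are hypothetical, and the exact normalisation used inside the package may differ.

```r
# Hypothetical allele counts at four SNP loci (made-up numbers).
count_a      <- c(30, 0, 15, 20)   # tumour reads supporting allele A
count_b      <- c(30, 40, 45, 20)  # tumour reads supporting allele B
normal_depth <- c(30, 30, 30, 30)  # depth in the (reconstructed) normal

# BAF: fraction of tumour reads that carry the B allele.
tumour_depth <- count_a + count_b
baf <- count_b / tumour_depth      # 0.50 1.00 0.75 0.50

# logR: log2 of the tumour/normal depth ratio, centred on the mean ratio.
depth_ratio <- tumour_depth / normal_depth
logr <- log2(depth_ratio / mean(depth_ratio))
```

A BAF of 1 at a locus that is heterozygous in the germline (the second locus here) is the kind of signal that suggests loss of heterozygosity.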
prepare_wgs_cell_line(
  chrom_names,
  chrom_coord,
  tumourbam,
  tumourname,
  g1000lociprefix,
  g1000allelesprefix,
  gamma_ivd = 100000,
  kmin_ivd = 50,
  centromere_noise_seg_size = 1000000,
  centromere_dist = 500000,
  min_het_dist = 100000,
  gamma_logr = 100,
  length_adjacent = 50000,
  gccorrectprefix,
  repliccorrectprefix,
  min_base_qual,
  min_map_qual,
  allelecounter_exe,
  min_normal_depth,
  skip_allele_counting
)
chrom_names: A vector containing the names of chromosomes to be included
tumourbam: Full path to the tumour BAM file
tumourname: Identifier to be used for tumour output files (i.e. the cell line BAM file name without the '.bam' extension)
g1000lociprefix: Prefix path to the 1000 Genomes loci reference files
g1000allelesprefix: Prefix path to the 1000 Genomes SNP allele reference files
gamma_ivd: The PCF gamma value for segmentation of 1000G hetSNP IVD values (Default 1e5)
kmin_ivd: The minimum number of SNPs to support a segment in PCF of 1000G hetSNP IVD values (Default 50)
centromere_noise_seg_size: The maximum size of a PCF segment to be removed as noise when it overlaps with the centromere, due to the noisy nature of the data (Default 1e6)
centromere_dist: The minimum distance from the centromere to ignore in analysis, due to the noisy nature of data in the vicinity of centromeres (Default 5e5)
min_het_dist: The minimum distance for detecting higher-resolution inter-hetSNP regions with potential LOH while accounting for inherent homozygote stretches (Default 1e5)
gamma_logr: The PCF gamma value for confirming LOH within each inter-hetSNP candidate segment (Default 100)
length_adjacent: The length of the adjacent regions on either side of a candidate inter-hetSNP LOH region to be plotted (Default 5e4)
gccorrectprefix: Prefix path to GC content reference data
repliccorrectprefix: Prefix path to replication timing reference data (supply NULL if no replication timing correction is to be applied)
min_base_qual: Minimum base quality required for a read to be counted
min_map_qual: Minimum mapping quality required for a read to be counted
allelecounter_exe: Path to the allele counter executable (it can be located via $PATH)
min_normal_depth: Minimum depth required in the normal for a SNP to be included
skip_allele_counting: Flag; set to TRUE if allele counting is already complete (the output files are expected in the working directory)
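A minimal sketch of how a call might be assembled from the arguments documented above. Every path, the chromosome naming, the executable name "alleleCounter", and the quality and depth thresholds are placeholders chosen for illustration, not values taken from this page. It is also an assumption that the function is exported from a package named Battenberg. Arguments with defaults are left out, and the call is guarded so that nothing runs unless the package and the BAM file are actually present.

```r
# Hypothetical BAM location; replace with a real path before running.
tumourbam <- "/data/cell_line/HYPOTHETICAL_LINE.bam"

# tumourname is documented as the BAM file name without the '.bam' extension.
tumourname <- sub("\\.bam$", "", basename(tumourbam))

# Collect the arguments that have no default. All values are placeholders.
args <- list(
  chrom_names          = c(as.character(1:22), "X"),  # assumed naming scheme
  chrom_coord          = "/ref/chromosome_coordinates.txt",
  tumourbam            = tumourbam,
  tumourname           = tumourname,
  g1000lociprefix      = "/ref/1000G_loci/loci_chr",
  g1000allelesprefix   = "/ref/1000G_alleles/alleles_chr",
  gccorrectprefix      = "/ref/gc_content/gc_chr",
  repliccorrectprefix  = NULL,             # NULL: no replication timing correction
  min_base_qual        = 20,               # illustrative threshold
  min_map_qual         = 35,               # illustrative threshold
  allelecounter_exe    = "alleleCounter",  # assumed name, looked up via $PATH
  min_normal_depth     = 10,               # illustrative threshold
  skip_allele_counting = FALSE
)

# Sys.which() returns "" when the executable cannot be found on $PATH.
allelecounter_found <- nzchar(Sys.which(args$allelecounter_exe))

# Run only if the package is installed and the input file exists.
if (requireNamespace("Battenberg", quietly = TRUE) && file.exists(tumourbam)) {
  do.call(Battenberg::prepare_wgs_cell_line, args)
}
```

Building the arguments as a named list keeps the long call readable and makes it easy to confirm that the names match the usage signature before anything expensive is started.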