The raw BAF and LogR data have been written to separate files. This function converts the BAF data into the input format that Impute2 expects. It processes one chromosome per call, so the calls can be run in parallel across chromosomes.
generate.impute.input.snp6(
  infile.germlineBAF,
  infile.tumourBAF,
  outFileStart,
  chrom,
  chr_names,
  problemLociFile,
  snp6_reference_info_file,
  imputeinfofile,
  is.male,
  heterozygousFilter = "none"
)

infile.germlineBAF: Germline BAF file generated by cel2baf.logr.
infile.tumourBAF: Tumour BAF file generated by cel2baf.logr.
outFileStart: Prefix of the filenames to which the Impute2 input will be written. The chromosome is appended to this prefix.
chrom: Character value naming the chromosome for which an Impute2 file is produced.
chr_names: A vector of chromosome names that can be considered. It may contain only the chromosome for which the Impute2 file is produced, or all chromosomes.
problemLociFile: Path to a file listing problematic loci that should be removed from the data.
snp6_reference_info_file: Path to the SNP6 reference info file that comes with Battenberg SNP6.
imputeinfofile: Path to the Impute 1000 Genomes reference info file that comes with Battenberg.
is.male: Boolean that is TRUE if the donor is male and FALSE if the donor is female.
heterozygousFilter: BAF cutoff for calling homozygous SNPs. Defaults to "none".
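
Because each call handles a single chromosome, a driver script can fan the calls out across chromosomes. The following is a minimal sketch of that pattern, not a reference workflow. Every file name and the output prefix are hypothetical placeholders, the chromosome list 1 to 22 plus X is an assumption, and the sketch passes the chromosome name as chrom, following the argument description above. The call itself is guarded so the block runs even when Battenberg or the input files are absent.

```r
library(parallel)

# Assumption: autosomes 1-22 plus X; adjust to the chromosomes in your data.
chr_names <- c(1:22, "X")

# Build one argument list per chromosome.
# All paths below are hypothetical placeholders, not files shipped with Battenberg.
build_args <- function(chrom) {
  list(
    infile.germlineBAF       = "sample_germlineBAF.tab",
    infile.tumourBAF         = "sample_tumourBAF.tab",
    outFileStart             = "sample_impute_input_chr",
    chrom                    = chrom,
    chr_names                = chr_names,
    problemLociFile          = "problem_loci.txt",
    snp6_reference_info_file = "snp6_reference_info.txt",
    imputeinfofile           = "impute_info.txt",
    is.male                  = TRUE,
    heterozygousFilter       = "none"
  )
}

arg_sets <- lapply(chr_names, build_args)

# mclapply forks processes, which is not supported on Windows,
# so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Only attempt the real calls when the package and the inputs are available.
inputs_present <- all(file.exists(c("sample_germlineBAF.tab", "sample_tumourBAF.tab")))
if (requireNamespace("Battenberg", quietly = TRUE) && inputs_present) {
  mclapply(
    arg_sets,
    function(a) do.call(Battenberg::generate.impute.input.snp6, a),
    mc.cores = n_cores
  )
}
```

Building the argument lists first keeps the per-chromosome inputs easy to inspect before any work is dispatched, and the same lists can be handed to a cluster scheduler instead of mclapply.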