Use a set of SNPS, insertions and deletions to modify a reference genome
genomogrify(x, var, ...)
# S4 method for XStringSet,GRanges
genomogrify(
x,
var,
alt_col = "ALT",
mask = GRanges(),
tag = NULL,
sep = "_",
var_tags = FALSE,
var_sep = "_",
verbose = TRUE,
...
)
# S4 method for BSgenome,GRanges
genomogrify(
x,
var,
alt_col = "ALT",
mask = GRanges(),
names,
tag = NULL,
sep = "_",
var_tags = FALSE,
var_sep = "_",
verbose = TRUE,
...
)
# S4 method for BSgenome,VcfFile
genomogrify(
x,
var,
alt_col = "ALT",
mask = GRanges(),
names,
tag = NULL,
sep = "_",
var_tags = FALSE,
var_sep = "_",
which,
verbose = TRUE,
...
)
# S4 method for XStringSet,VcfFile
genomogrify(
x,
var,
alt_col = "ALT",
mask = GRanges(),
tag = NULL,
sep = "_",
var_tags = FALSE,
var_sep = "_",
which,
verbose = TRUE,
...
)
A DNAStringSet or BSgenome
GRanges object containing the variants, or a VariantAnnotation::VcfFile
Passed to parallel::mclapply
The name of the column with var
containing alternate bases
Optional GRanges object defining regions to be masked with an 'N'
Optional tag to add to all sequence names which were modified
Separator to place between seqnames names & tag
logical(1) Add tags indicating which type of variant were incorporated, with 's', 'i' and 'd' representing SNPs, Insertions and Deletions respectively
Separator between any previous tags and variant tags
logical(1) Print progress messages while running
Sequence names to be mogrified
GRanges object passed to VariantAnnotation::ScanVcfParam if using a VCF directly
XStringSet with variant modified sequences
This function is designed to create a variant-modified reference genome, intended to be included as a set of decoys when using salmon in selective alignment mode. Sequence lengths will change if InDels are included and any coordinate-based information will be lost on the output of this function.
Tags are able to be added to any modified sequence to assist identifying any changes that have been made to a sequence.
library(GenomicRanges)
dna <- DNAStringSet(c(chr1 = "ACGT", chr2 = "AATTT"))
var <- GRanges(c("chr1:1", "chr1:3", "chr2:1-3"))
var$ALT <- c("C", "GG", "A")
dna
#> DNAStringSet object of length 2:
#> width seq names
#> [1] 4 ACGT chr1
#> [2] 5 AATTT chr2
genomogrify(dna, var)
#> Updating chr1; Original length: 4
#> ; Updated length: 5
#> Updating chr2; Original length: 5
#> ; Updated length: 3
#> DNAStringSet object of length 2:
#> width seq names
#> [1] 5 CCGGT chr1
#> [2] 3 ATT chr2
genomogrify(dna, var, tag = "mod")
#> Updating chr1; Original length: 4
#> ; Updated length: 5
#> Updating chr2; Original length: 5
#> ; Updated length: 3
#> DNAStringSet object of length 2:
#> width seq names
#> [1] 5 CCGGT chr1_mod
#> [2] 3 ATT chr2_mod
genomogrify(dna, var, var_tags = TRUE)
#> Updating chr1; Original length: 4
#> ; Updated length: 5
#> Updating chr2; Original length: 5
#> ; Updated length: 3
#> DNAStringSet object of length 2:
#> width seq names
#> [1] 5 CCGGT chr1_si
#> [2] 3 ATT chr2_d
genomogrify(dna, var, mask = GRanges("chr2:1-5"), var_tags = TRUE)
#> Applying mask
#> Updating chr1; Original length: 4
#> ; Updated length: 5
#> DNAStringSet object of length 2:
#> width seq names
#> [1] 5 CCGGT chr1_si
#> [2] 5 NNNNN chr2