Form a set of random, matching ranges for bootstrapping or permuting
Source: R/makeRMRanges.R
makeRMRanges-methods.Rd
Form a set of ranges from y which (near) exactly match those in x, for use as a background set when matching is required
Usage
makeRMRanges(x, y, ...)
# S4 method for class 'GRanges,GRanges'
makeRMRanges(
  x,
  y,
  exclude = GRanges(),
  n_iter = 1,
  n_total = NULL,
  replace = TRUE,
  ...,
  force_ol = TRUE
)
# S4 method for class 'GRangesList,GRangesList'
makeRMRanges(
  x,
  y,
  exclude = GRanges(),
  n_iter = 1,
  n_total = NULL,
  replace = TRUE,
  mc.cores = 1,
  ...,
  force_ol = TRUE,
  unlist = TRUE
)
Arguments
- x
GRanges/GRangesList with ranges to be matched
- y
GRanges/GRangesList with ranges to select random matching ranges from
- ...
Not used
- exclude
GRanges of ranges to omit from testing
- n_iter
The number of times to repeat the random selection process
- n_total
Setting this value will override anything set using n_iter. When x is a GRangesList, this can be a vector with length corresponding to the length of x
- replace
logical(1) Sample with or without replacement when creating the set of random ranges.
- force_ol
logical(1) Enforce an overlap between every site in x and y
- mc.cores
Passed to mclapply
- unlist
logical(1) Return as a sorted GRanges object, or leave as a GRangesList
Details
This function uses the width distribution of the 'test' ranges (i.e. x) to
randomly sample a set of ranges with matching width from the ranges provided
in y. The width distribution will clearly be exact when a set of fixed-width
ranges is passed to x, whilst random sampling may yield some variability when
matching ranges of variable width.
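The idea of width-matched sampling can be sketched in base R. This is a conceptual illustration only and not the package's implementation: the helper `sample_matched()`, the plain data frames standing in for GRanges objects, and the choice to pick candidate regions uniformly are all assumptions made for the example.

```r
## Conceptual sketch only: base R, no Bioconductor classes
set.seed(1)

## Hypothetical 'test' widths (standing in for x) and two candidate
## regions to sample from (standing in for y)
x_width <- c(400L, 400L, 250L, 600L)
y <- data.frame(start = c(1L, 5001L), end = c(3000L, 9000L))

## Hypothetical helper: sample n ranges from y whose widths are drawn
## from the widths observed in x
sample_matched <- function(x_width, y, n = length(x_width)) {
  ## Draw the widths to reproduce, then pick a candidate region for each
  w <- sample(x_width, n, replace = TRUE)
  i <- sample.int(nrow(y), n, replace = TRUE)
  ## Latest start which keeps a range of width w inside region i
  max_start <- y$end[i] - w + 1L
  stopifnot(all(max_start >= y$start[i]))
  ## Uniform start position between the region start and max_start
  s <- y$start[i] + floor(runif(n) * (max_start - y$start[i] + 1))
  data.frame(start = as.integer(s), end = as.integer(s + w - 1L))
}

matched <- sample_matched(x_width, y, n = 8)
## Every sampled width is one of the widths seen in x
table(matched$end - matched$start + 1L)
```

With fixed-width input, e.g. `sample_matched(rep(400L, 3), y)`, every sampled range has width 400, whereas variable-width input only reproduces the width distribution approximately.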
When both x and y are GRanges objects, they are implicitly assumed to both
represent similar ranges, such as those overlapping a promoter or enhancer.
When passing two GRangesList objects, both objects are expected to contain
ranges annotated as belonging to key features, such that the list elements in
y must encompass all elements in x.
For example if x contains two elements named 'promoter' and 'intron', y
should also contain elements named 'promoter' and 'intron' and these will
be sampled as matching ranges for the same element in x. If elements of
x and y are not named, they are assumed to be in matching order.
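The pairing rule for list elements can be illustrated with ordinary named lists. The helper `match_elements()` below is hypothetical and only mirrors the behaviour described above (match by name when both inputs are named, otherwise by position); the function's own internal checks may differ.

```r
## Hypothetical helper mirroring the described pairing of x and y elements
match_elements <- function(x, y) {
  if (!is.null(names(x)) && !is.null(names(y))) {
    ## Named inputs: every element of x needs a same-named element in y
    missing_nm <- setdiff(names(x), names(y))
    if (length(missing_nm) > 0) {
      stop("No elements in y named: ", paste(missing_nm, collapse = ", "))
    }
    return(y[names(x)])
  }
  ## Unnamed inputs: assume matching order
  stopifnot(length(y) >= length(x))
  y[seq_along(x)]
}

## Placeholder values stand in for the ranges in each list element
x <- list(promoter = c(400, 400), intron = c(250, 600))
y <- list(
  intron = "intronic candidates",
  promoter = "promoter candidates",
  exon = "unused"
)

matched <- match_elements(x, y)
names(matched)
#> [1] "promoter" "intron"
```

Extra elements in y (here 'exon') are simply ignored, while an element of x with no counterpart in y raises an error in this sketch.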
The default behaviour is to assume that randomly-generated ranges are for
iteration, and as such, ranges are randomly formed in multiples of the number
of 'test' ranges provided in x. The column iteration will be added to the
returned ranges.
Placing any number into the n_total argument will instead select a total
number of ranges as specified here. In this case, no iteration column will
be included in the returned ranges.
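The difference between the two modes can be checked directly on the returned objects. The sketch below reuses the calls from the Examples section and assumes the extraChIPs and GenomicRanges packages are installed; the object names and the per-iteration summary are illustrative choices only.

```r
library(extraChIPs)
library(GenomicRanges)

## Same inputs as in the Examples section
data("ar_er_peaks")
sq <- seqinfo(ar_er_peaks)
bg <- GRanges(sq)[1]
set.seed(100)

## Iterative mode: n_iter sets of ranges, labelled by the iteration column
by_n_iter <- makeRMRanges(ar_er_peaks, bg, n_iter = 2)
"iteration" %in% colnames(mcols(by_n_iter))

## Split into one set of ranges per iteration, e.g. to compute a
## statistic separately for each iteration
iter_list <- split(by_n_iter, by_n_iter$iteration)
lengths(iter_list)
vapply(iter_list, function(gr) mean(width(gr)), numeric(1))

## Fixed-total mode: exactly n_total ranges and no iteration column
by_n_total <- makeRMRanges(ar_er_peaks, bg, n_total = 100)
length(by_n_total)
"iteration" %in% colnames(mcols(by_n_total))
```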
Sampling is assumed to be with replacement as this is most suitable for
bootstrapping and related procedures, although this can be disabled by
setting replace = FALSE.
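The practical consequence of the replace setting can be seen with base R's `sample()`, which is used here purely as an analogy. How makeRMRanges itself behaves when more ranges are requested than can be drawn without replacement is not stated in this page, so the final step is an assumption about sampling in general rather than about this function.

```r
set.seed(42)

## Five hypothetical candidate positions to draw from
candidates <- seq_len(5)

## With replacement: any number of draws is possible and values can repeat,
## which is what bootstrapping relies on
boot <- sample(candidates, 10, replace = TRUE)
anyDuplicated(boot) > 0

## Without replacement: each candidate is used at most once (a permutation)
perm <- sample(candidates, 5, replace = FALSE)
anyDuplicated(perm) == 0

## Without replacement, base R cannot draw more values than are available
too_many <- tryCatch(
  sample(candidates, 10, replace = FALSE),
  error = function(e) conditionMessage(e)
)
too_many
```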
Examples
## Load the example peaks
data("ar_er_peaks")
sq <- seqinfo(ar_er_peaks)
## Now sample size-matched ranges for two iterations from chr1
makeRMRanges(ar_er_peaks, GRanges(sq)[1], n_iter = 2)
#> GRanges object with 1698 ranges and 1 metadata column:
#> seqnames ranges strand | iteration
#> <Rle> <IRanges> <Rle> | <integer>
#> [1] chr1 91036-91435 * | 2
#> [2] chr1 151570-151969 * | 2
#> [3] chr1 156793-157192 * | 1
#> [4] chr1 335925-336324 * | 2
#> [5] chr1 365758-366157 * | 1
#> ... ... ... ... . ...
#> [1694] chr1 248443254-248443653 * | 2
#> [1695] chr1 248493084-248493483 * | 2
#> [1696] chr1 248801396-248801795 * | 1
#> [1697] chr1 248869157-248869556 * | 2
#> [1698] chr1 249196752-249197151 * | 1
#> -------
#> seqinfo: 24 sequences from hg19 genome
## Or simply sample 100 ranges if not planning any iterative analyses
makeRMRanges(ar_er_peaks, GRanges(sq)[1], n_total = 100)
#> GRanges object with 100 ranges and 0 metadata columns:
#> seqnames ranges strand
#> <Rle> <IRanges> <Rle>
#> [1] chr1 1594492-1594891 *
#> [2] chr1 6204139-6204538 *
#> [3] chr1 8926298-8926697 *
#> [4] chr1 10338571-10338970 *
#> [5] chr1 11417646-11418045 *
#> ... ... ... ...
#> [96] chr1 237632083-237632482 *
#> [97] chr1 241318705-241319104 *
#> [98] chr1 243568836-243569235 *
#> [99] chr1 245708450-245708849 *
#> [100] chr1 247427291-247427690 *
#> -------
#> seqinfo: 24 sequences from hg19 genome