Perform set operations retaining all mcols from the query range

setdiffMC(x, y, ...)

intersectMC(x, y, ...)

unionMC(x, y, ...)

# S4 method for class 'GRanges,GRanges'
setdiffMC(x, y, ignore.strand = FALSE, simplify = TRUE, ...)

# S4 method for class 'GRanges,GRanges'
intersectMC(x, y, ignore.strand = FALSE, simplify = TRUE, ...)

# S4 method for class 'GRanges,GRanges'
unionMC(x, y, ignore.strand = FALSE, simplify = TRUE, ...)

Arguments

x, y

GRanges objects

...

Not used

ignore.strand

If set to TRUE, then the strand of x and y is set to "*" prior to any computation.

simplify

logical(1). If TRUE, any List columns will be returned as vectors where possible. This is only possible when every element of the List column contains a single, unique entry.

Value

A GRanges object with all mcols returned from the original object x. If an output range maps back to two or more ranges in x, mcols will be returned as CompressedList columns.

Details

This extends the methods provided by setdiff, intersect and union so that mcols from x will be returned as part of the output.

Where output ranges map back to multiple ranges in x, CompressedList columns will be returned. By default, these will be simplified if possible. This behaviour can be disabled by setting simplify = FALSE.
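As a minimal sketch of the multi-mapping case described above, the following uses two overlapping query ranges with made-up id values, so that each output range overlaps both ranges in x. The ranges and ids are illustrative assumptions, and the column classes are inspected rather than asserted, since the exact class returned may depend on the package version.

```r
library(GenomicRanges)
library(extraChIPs)

# Two overlapping query ranges, each with its own id (illustrative values)
x <- GRanges(c("chr1:1-60:+", "chr1:41-100:+"))
x$id <- c("range1", "range2")
y <- GRanges("chr1:51-55:+")

# Each output range overlaps both ranges in x, so id carries two values
res_default <- setdiffMC(x, y)
res_list <- setdiffMC(x, y, simplify = FALSE)

# Inspect the column classes instead of assuming them
class(res_default$id)
class(res_list$id)
```

Comparing the two results shows how simplify affects the id column when more than one value maps to an output range.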

All columns in the mcols of x will be returned, which can be time-consuming when many columns are present. A wise approach is to include only the columns you require in the query ranges x.
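One way to restrict the returned columns is to subset the mcols of x before calling the function. The sketch below assumes a query with two extra columns (score and note are hypothetical names used only for illustration) and keeps just id.

```r
library(GenomicRanges)
library(extraChIPs)

x <- GRanges("chr1:1-100:+")
x$id <- "range1"
x$score <- 10        # hypothetical column, not needed downstream
x$note <- "unused"   # hypothetical column, not needed downstream
y <- GRanges("chr1:51-60:+")

# Keep only the id column in the query before the set operation
res <- setdiffMC(x[, "id"], y)
colnames(mcols(res))
```

Subsetting with x[, "id"] leaves the ranges untouched and drops the other metadata columns, so only id is carried through to the output.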

If more nuanced approaches are required, the returned columns can be further modified by many functions included in the plyranges package, such as mutate().
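As a hedged illustration of this kind of post-processing, the sketch below uses mutate() from plyranges to add a column counting how many values each output range carries in id. The input ranges and the column name n_source are assumptions made for the example.

```r
library(GenomicRanges)
library(extraChIPs)
library(plyranges)

# Two overlapping query ranges, each with its own id (illustrative values)
x <- GRanges(c("chr1:1-60:+", "chr1:41-100:+"))
x$id <- c("range1", "range2")
y <- GRanges("chr1:51-55:+")

# Keep list columns as they are so every element can be inspected
res <- setdiffMC(x, y, simplify = FALSE)

# n_source is a hypothetical column name: the number of id values per range
res <- mutate(res, n_source = lengths(id))
res
```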

Examples

x <- GRanges("chr1:1-100:+")
x$id <- "range1"
y <- GRanges(c("chr1:51-60:+", "chr1:21-30:-"))
setdiffMC(x, y)
#> GRanges object with 2 ranges and 1 metadata column:
#>       seqnames    ranges strand |          id
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1      1-50      + |      range1
#>   [2]     chr1    61-100      + |      range1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
setdiffMC(x, y, ignore.strand = TRUE)
#> GRanges object with 3 ranges and 1 metadata column:
#>       seqnames    ranges strand |          id
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1      1-20      * |      range1
#>   [2]     chr1     31-50      * |      range1
#>   [3]     chr1    61-100      * |      range1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

# The intersection works similarly
intersectMC(x, y)
#> GRanges object with 1 range and 1 metadata column:
#>       seqnames    ranges strand |          id
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1     51-60      + |      range1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

# Union may contain ranges not initially in x
unionMC(x, y)
#> GRanges object with 2 ranges and 1 metadata column:
#>       seqnames    ranges strand |          id
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1     1-100      + |      range1
#>   [2]     chr1     21-30      - |        <NA>
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
unionMC(x, y, ignore.strand = TRUE)
#> GRanges object with 1 range and 1 metadata column:
#>       seqnames    ranges strand |          id
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1     1-100      * |      range1
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths