Make a set of consensus peaks based on the number of replicates

makeConsensus(
  x,
  p = 0,
  var = NULL,
  method = c("union", "coverage"),
  ignore.strand = TRUE,
  simplify = FALSE,
  min_width = 0,
  merge_within = 1L,
  ...
)

Arguments

x

A GRangesList

p

The minimum proportion of samples (i.e. elements of x) in which a peak must be found for it to be retained in the output. With the default (p = 0), all merged peaks are returned.

var

Additional columns in the mcols element to retain

method

Either return the union of all overlapping ranges ("union"), or only the regions within the overlapping ranges which are covered by the specified proportion of replicates ("coverage"). When using p = 0, both methods will return identical results.

ignore.strand, simplify, ...

Passed to reduceMC or intersectMC internally

min_width

Discard any regions below this width

merge_within

Passed to reduce as min.gapwidth
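
As a rough illustration of what merge_within controls, the base R sketch below merges ranges on a single chromosome whenever the gap between them is smaller than a threshold. This is what min.gapwidth is understood to do in reduce(). The helper merge_ranges() and its min_gap argument are hypothetical names used only for this illustration and are not part of the package.

```r
## Hypothetical helper: merge sorted ranges separated by a gap < min_gap
merge_ranges <- function(start, end, min_gap = 1L) {
  o <- order(start, end)
  start <- start[o]
  end <- end[o]
  out_start <- start[1]
  out_end <- end[1]
  for (i in seq_along(start)[-1]) {
    k <- length(out_end)
    ## Number of uncovered positions between the current merged range and range i
    gap <- start[i] - out_end[k] - 1
    if (gap < min_gap) {
      ## Close enough: extend the current merged range
      out_end[k] <- max(out_end[k], end[i])
    } else {
      ## Too far apart: start a new merged range
      out_start <- c(out_start, start[i])
      out_end <- c(out_end, end[i])
    }
  }
  data.frame(start = out_start, end = out_end)
}

## Three toy ranges: the first two are adjacent, the third is 9bp further on
default_merge <- merge_ranges(c(100, 201, 260), c(200, 250, 300))
wider_merge <- merge_ranges(c(100, 201, 260), c(200, 250, 300), min_gap = 10L)
```

With the default threshold of 1 only the adjacent pair is merged, giving 100-250 and 260-300. Raising the threshold to 10 also absorbs the 9bp gap and gives a single range, 100-300.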

Value

GRanges object with mcols containing a logical column for every element of x, along with the column n, which counts the TRUE values across these logical columns. The logical columns denote which replicates contain an overlapping peak for each range.

If any additional columns have been requested using var, these will be returned as CompressedList objects as produced by reduceMC() or intersectMC().
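
The relationship between the logical columns, n and p can be sketched in base R. The values below are copied from rows 1 to 5 and row 242 of the first call in the Examples. The data frame mc is only a stand-in for the mcols of the returned object, and the filtering step is an assumption about how p is applied, not the package's own code.

```r
## Logical columns for rows 1-5 and 242 of makeConsensus(grl) in the Examples
mc <- data.frame(
  SRR8315180 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE),
  SRR8315181 = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  SRR8315182 = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  row.names = c("1", "2", "3", "4", "5", "242")
)

## n is the number of replicates with an overlapping peak
n <- rowSums(mc)

## Assumed filter: keep ranges found in at least a proportion p of replicates
p <- 2 / 3
keep <- n / ncol(mc) >= p
```

Here n matches the printed column (3, 3, 2, 3, 2, 1), and only row 242 fails the p = 2/3 threshold, which is consistent with that range being absent from the second call in the Examples.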

Details

This takes a list of GRanges objects and forms a set of consensus peaks.

When using method = "union", the union ranges of all overlapping peaks will be returned, provided they are found in at least the specified minimum proportion of replicates. When using method = "coverage", only the regions within each overlapping range which are 'covered' by the specified minimum proportion of replicates are returned. This will return narrower peaks in general, although some artefactual, very small ranges may be included (e.g. 10bp). Setting the min_width and merge_within parameters carefully can help remove or merge these small ranges. It is also expected that setting method = "coverage" will return the region within each range which is more likely to contain the true binding site for the relevant ChIP target.
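
The difference between the two methods can be sketched in base R on a single toy chromosome. This only mimics the idea using per-base replicate counts. It is not the package's implementation, which relies on reduceMC() and intersectMC(), and the toy peaks, the reps list and both helper functions are hypothetical.

```r
## Toy peaks from three hypothetical replicates on one chromosome
reps <- list(
  rep1 = data.frame(start = c(100, 500), end = c(200, 600)),
  rep2 = data.frame(start = 150, end = 250),
  rep3 = data.frame(start = c(180, 900), end = c(300, 950))
)
p <- 2 / 3
len <- max(unlist(lapply(reps, function(df) df$end)))

## Mark every position covered by a peak in one replicate
covered_by <- function(df) {
  hit <- logical(len)
  for (i in seq_len(nrow(df))) hit[df$start[i]:df$end[i]] <- TRUE
  hit
}
hits <- vapply(reps, covered_by, logical(len)) # positions x replicates
depth <- rowSums(hits)                         # replicates covering each base

## Convert a logical vector over positions into start/end runs of TRUE
runs_to_ranges <- function(keep) {
  r <- rle(keep)
  end <- cumsum(r$lengths)
  start <- end - r$lengths + 1
  data.frame(start = start[r$values], end = end[r$values])
}

## "union" analogue: merge all peaks, then count replicates per merged range
union_rng <- runs_to_ranges(depth > 0)
union_rng$n <- vapply(
  seq_len(nrow(union_rng)),
  function(i) {
    idx <- union_rng$start[i]:union_rng$end[i]
    sum(colSums(hits[idx, , drop = FALSE]) > 0)
  },
  integer(1)
)
union_rng <- union_rng[union_rng$n / length(reps) >= p, ]

## "coverage" analogue: keep only bases covered by enough replicates,
## then drop very small ranges (analogous to min_width)
cov_rng <- runs_to_ranges(depth / length(reps) >= p)
cov_rng$width <- cov_rng$end - cov_rng$start + 1
cov_rng <- cov_rng[cov_rng$width >= 50, ]
```

In this toy case the union analogue keeps the full merged range 100-300, which all three replicates overlap, while the coverage analogue keeps only 150-250, where at least two replicates overlap. The singleton peaks at 500-600 and 900-950 are dropped by both.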

Examples

data("peaks")
## The first three replicates are from the same treatment group
grl <- peaks[1:3]
names(grl) <- gsub("_peaks.+", "", names(grl))
makeConsensus(grl)
#> GRanges object with 244 ranges and 4 metadata columns:
#>         seqnames            ranges strand | SRR8315180 SRR8315181 SRR8315182
#>            <Rle>         <IRanges>  <Rle> |  <logical>  <logical>  <logical>
#>     [1]    chr10 43048195-43048529      * |       TRUE       TRUE       TRUE
#>     [2]    chr10 43521739-43522260      * |       TRUE       TRUE       TRUE
#>     [3]    chr10 43540042-43540390      * |       TRUE      FALSE       TRUE
#>     [4]    chr10 43606238-43606573      * |       TRUE       TRUE       TRUE
#>     [5]    chr10 43851214-43851989      * |      FALSE       TRUE       TRUE
#>     ...      ...               ...    ... .        ...        ...        ...
#>   [240]    chr10 99168353-99168649      * |       TRUE       TRUE       TRUE
#>   [241]    chr10 99207868-99208156      * |      FALSE       TRUE       TRUE
#>   [242]    chr10 99326100-99326363      * |      FALSE      FALSE       TRUE
#>   [243]    chr10 99331363-99331730      * |       TRUE       TRUE       TRUE
#>   [244]    chr10 99621632-99621961      * |      FALSE       TRUE       TRUE
#>                 n
#>         <numeric>
#>     [1]         3
#>     [2]         3
#>     [3]         2
#>     [4]         3
#>     [5]         2
#>     ...       ...
#>   [240]         3
#>   [241]         2
#>   [242]         1
#>   [243]         3
#>   [244]         2
#>   -------
#>   seqinfo: 25 sequences from GRCh37 genome
makeConsensus(grl, p = 2/3, var = "score")
#> GRanges object with 164 ranges and 5 metadata columns:
#>         seqnames            ranges strand |         score SRR8315180 SRR8315181
#>            <Rle>         <IRanges>  <Rle> | <NumericList>  <logical>  <logical>
#>     [1]    chr10 43048195-43048529      * |   251,531,391       TRUE       TRUE
#>     [2]    chr10 43521739-43522260      * |   223,548,645       TRUE       TRUE
#>     [3]    chr10 43540042-43540390      * |        58,206       TRUE      FALSE
#>     [4]    chr10 43606238-43606573      * |    92,192,302       TRUE       TRUE
#>     [5]    chr10 43851214-43851989      * |        87,148      FALSE       TRUE
#>     ...      ...               ...    ... .           ...        ...        ...
#>   [160]    chr10 99096784-99097428      * |   196,117,308       TRUE       TRUE
#>   [161]    chr10 99168353-99168649      * |    48,105,130       TRUE       TRUE
#>   [162]    chr10 99207868-99208156      * |        94,137      FALSE       TRUE
#>   [163]    chr10 99331363-99331730      * |   210,258,504       TRUE       TRUE
#>   [164]    chr10 99621632-99621961      * |         74,75      FALSE       TRUE
#>         SRR8315182         n
#>          <logical> <numeric>
#>     [1]       TRUE         3
#>     [2]       TRUE         3
#>     [3]       TRUE         2
#>     [4]       TRUE         3
#>     [5]       TRUE         2
#>     ...        ...       ...
#>   [160]       TRUE         3
#>   [161]       TRUE         3
#>   [162]       TRUE         2
#>   [163]       TRUE         3
#>   [164]       TRUE         2
#>   -------
#>   seqinfo: 25 sequences from GRCh37 genome

## Using method = 'coverage' finds ranges based on the intersection
makeConsensus(grl, p = 2/3, var = "score", method = "coverage")
#> GRanges object with 164 ranges and 5 metadata columns:
#>         seqnames            ranges strand |         score SRR8315180 SRR8315181
#>            <Rle>         <IRanges>  <Rle> | <NumericList>  <logical>  <logical>
#>     [1]    chr10 43048224-43048519      * |   251,531,391       TRUE       TRUE
#>     [2]    chr10 43521773-43522218      * |   223,548,645       TRUE       TRUE
#>     [3]    chr10 43540200-43540384      * |        58,206       TRUE      FALSE
#>     [4]    chr10 43606264-43606559      * |    92,192,302       TRUE       TRUE
#>     [5]    chr10 43851570-43851940      * |        87,148      FALSE       TRUE
#>     ...      ...               ...    ... .           ...        ...        ...
#>   [160]    chr10 99097038-99097398      * |   196,117,308       TRUE       TRUE
#>   [161]    chr10 99168367-99168608      * |    48,105,130       TRUE       TRUE
#>   [162]    chr10 99207908-99208139      * |        94,137      FALSE       TRUE
#>   [163]    chr10 99331412-99331687      * |   210,258,504       TRUE       TRUE
#>   [164]    chr10 99621674-99621944      * |         74,75      FALSE       TRUE
#>         SRR8315182         n
#>          <logical> <numeric>
#>     [1]       TRUE         3
#>     [2]       TRUE         3
#>     [3]       TRUE         2
#>     [4]       TRUE         3
#>     [5]       TRUE         2
#>     ...        ...       ...
#>   [160]       TRUE         3
#>   [161]       TRUE         3
#>   [162]       TRUE         2
#>   [163]       TRUE         3
#>   [164]       TRUE         2
#>   -------
#>   seqinfo: 25 sequences from GRCh37 genome