Make a set of consensus peaks based on the number of replicates
makeConsensus(
  x,
  p = 0,
  var = NULL,
  method = c("union", "coverage"),
  ignore.strand = TRUE,
  simplify = FALSE,
  min_width = 0,
  merge_within = 1L,
  ...
)
x: A GRangesList
p: The minimum proportion of samples (i.e. elements of x) required for a peak to be retained in the output. By default (p = 0), all merged peaks will be returned.
var: Additional columns in the mcols element to retain
method: Either return the union of all overlapping ranges, or the regions within the overlapping ranges which are covered by the specified proportion of replicates. When using p = 0, both methods will return identical results.
ignore.strand, simplify, ...: Passed to reduceMC or intersectMC internally
min_width: Discard any regions below this width
merge_within: Passed to reduce as min.gapwidth
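As a rough illustration of what min_width and merge_within control, the sketch below applies the equivalent GenomicRanges operations to a hand-made set of fragments. The toy ranges are invented, and the order shown (merge first, then filter on width) is an assumption rather than a description of the internals of makeConsensus().

```r
library(GenomicRanges)

## Hypothetical fragments, e.g. as might arise when using method = "coverage"
frags <- GRanges(c("chr1:51-100", "chr1:103-110", "chr1:500-504"))

## Ranges separated by a gap of fewer than min.gapwidth positions are merged.
## The gap between the first two fragments is 2bp (101-102), so they merge
merged <- reduce(frags, min.gapwidth = 3L)

## Discard anything narrower than the chosen minimum width (assumed 10bp here)
kept <- merged[width(merged) >= 10]
kept
```

Here only the merged range chr1:51-110 survives, while the isolated 5bp fragment is dropped.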
A GRanges object with mcols containing a logical vector for every element of x, along with the column n, which sums all logical columns. These columns denote which replicates contain an overlapping peak for each range.
If any additional columns have been requested using var, these will be returned as CompressedList objects, as produced by reduceMC() or intersectMC().
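The relationship between the logical columns, n and p can be sketched on a toy object. The ranges, the replicate names (rep1 to rep3) and the filtering expression are all hypothetical, chosen only to mirror the structure described above. This is not the package's own code.

```r
library(GenomicRanges)

## Hypothetical merged ranges and replicate membership (not real data)
gr <- GRanges(c("chr1:1-100", "chr1:201-300", "chr1:401-500"))
hits <- data.frame(
  rep1 = c(TRUE, TRUE, FALSE),
  rep2 = c(TRUE, FALSE, FALSE),
  rep3 = c(TRUE, TRUE, TRUE)
)
mcols(gr) <- DataFrame(hits)

## n is the number of replicates with an overlapping peak
gr$n <- unname(rowSums(hits))

## Retain ranges found in at least a proportion p of the replicates
p <- 2/3
consensus <- gr[gr$n / ncol(hits) >= p]
consensus$n
```

With p = 2/3 and three replicates, the third range (present in one replicate only) is dropped, leaving ranges with n equal to 3 and 2.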
This takes a list of GRanges objects and forms a set of consensus peaks.
When using method = "union", the union of all overlapping peaks is returned for every merged range supported by at least the specified minimum proportion of replicates. When using method = "coverage", only the regions within each overlapping range which are covered by at least the minimum proportion of replicates are returned. This will generally return narrower peaks, although some artefactual, very small ranges may be included (e.g. 10bp). Careful setting of the min_width and merge_within parameters can help in these instances. Setting method = "coverage" is also expected to return the region within each range which is more likely to contain the true binding site for the relevant ChIP targets.
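The difference between the two methods can be seen on a minimal toy example built from GenomicRanges primitives. The three replicates below are invented, and the use of reduce(), coverage() and slice() is an illustrative stand-in for the idea, not necessarily how makeConsensus() is implemented.

```r
library(GenomicRanges)

## Three hypothetical replicates, each with one peak in the same region
grl <- GRangesList(
  a = GRanges("chr1:1-100"),
  b = GRanges("chr1:51-150"),
  c = GRanges("chr1:81-200")
)
all_peaks <- unlist(grl)

## Union: merge every overlapping range into a single range
union_gr <- reduce(all_peaks)

## Coverage: keep only positions covered by at least 2 of the 3 replicates
min_reps <- 2
cov <- coverage(all_peaks)[["chr1"]]
v <- slice(cov, lower = min_reps)
cov_gr <- GRanges("chr1", IRanges(start(v), end(v)))

union_gr  # chr1:1-200
cov_gr    # chr1:51-150
```

The coverage-based range is half the width of the union range, because the flanks supported by a single replicate are trimmed away.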
data("peaks")
## The first three replicates are from the same treatment group
grl <- peaks[1:3]
names(grl) <- gsub("_peaks.+", "", names(grl))
makeConsensus(grl)
#> GRanges object with 244 ranges and 4 metadata columns:
#> seqnames ranges strand | SRR8315180 SRR8315181 SRR8315182
#> <Rle> <IRanges> <Rle> | <logical> <logical> <logical>
#> [1] chr10 43048195-43048529 * | TRUE TRUE TRUE
#> [2] chr10 43521739-43522260 * | TRUE TRUE TRUE
#> [3] chr10 43540042-43540390 * | TRUE FALSE TRUE
#> [4] chr10 43606238-43606573 * | TRUE TRUE TRUE
#> [5] chr10 43851214-43851989 * | FALSE TRUE TRUE
#> ... ... ... ... . ... ... ...
#> [240] chr10 99168353-99168649 * | TRUE TRUE TRUE
#> [241] chr10 99207868-99208156 * | FALSE TRUE TRUE
#> [242] chr10 99326100-99326363 * | FALSE FALSE TRUE
#> [243] chr10 99331363-99331730 * | TRUE TRUE TRUE
#> [244] chr10 99621632-99621961 * | FALSE TRUE TRUE
#> n
#> <numeric>
#> [1] 3
#> [2] 3
#> [3] 2
#> [4] 3
#> [5] 2
#> ... ...
#> [240] 3
#> [241] 2
#> [242] 1
#> [243] 3
#> [244] 2
#> -------
#> seqinfo: 25 sequences from GRCh37 genome
makeConsensus(grl, p = 2/3, var = "score")
#> GRanges object with 164 ranges and 5 metadata columns:
#> seqnames ranges strand | score SRR8315180 SRR8315181
#> <Rle> <IRanges> <Rle> | <NumericList> <logical> <logical>
#> [1] chr10 43048195-43048529 * | 251,531,391 TRUE TRUE
#> [2] chr10 43521739-43522260 * | 223,548,645 TRUE TRUE
#> [3] chr10 43540042-43540390 * | 58,206 TRUE FALSE
#> [4] chr10 43606238-43606573 * | 92,192,302 TRUE TRUE
#> [5] chr10 43851214-43851989 * | 87,148 FALSE TRUE
#> ... ... ... ... . ... ... ...
#> [160] chr10 99096784-99097428 * | 196,117,308 TRUE TRUE
#> [161] chr10 99168353-99168649 * | 48,105,130 TRUE TRUE
#> [162] chr10 99207868-99208156 * | 94,137 FALSE TRUE
#> [163] chr10 99331363-99331730 * | 210,258,504 TRUE TRUE
#> [164] chr10 99621632-99621961 * | 74,75 FALSE TRUE
#> SRR8315182 n
#> <logical> <numeric>
#> [1] TRUE 3
#> [2] TRUE 3
#> [3] TRUE 2
#> [4] TRUE 3
#> [5] TRUE 2
#> ... ... ...
#> [160] TRUE 3
#> [161] TRUE 3
#> [162] TRUE 2
#> [163] TRUE 3
#> [164] TRUE 2
#> -------
#> seqinfo: 25 sequences from GRCh37 genome
## Using method = 'coverage' finds ranges based on the intersection
makeConsensus(grl, p = 2/3, var = "score", method = "coverage")
#> GRanges object with 164 ranges and 5 metadata columns:
#> seqnames ranges strand | score SRR8315180 SRR8315181
#> <Rle> <IRanges> <Rle> | <NumericList> <logical> <logical>
#> [1] chr10 43048224-43048519 * | 251,531,391 TRUE TRUE
#> [2] chr10 43521773-43522218 * | 223,548,645 TRUE TRUE
#> [3] chr10 43540200-43540384 * | 58,206 TRUE FALSE
#> [4] chr10 43606264-43606559 * | 92,192,302 TRUE TRUE
#> [5] chr10 43851570-43851940 * | 87,148 FALSE TRUE
#> ... ... ... ... . ... ... ...
#> [160] chr10 99097038-99097398 * | 196,117,308 TRUE TRUE
#> [161] chr10 99168367-99168608 * | 48,105,130 TRUE TRUE
#> [162] chr10 99207908-99208139 * | 94,137 FALSE TRUE
#> [163] chr10 99331412-99331687 * | 210,258,504 TRUE TRUE
#> [164] chr10 99621674-99621944 * | 74,75 FALSE TRUE
#> SRR8315182 n
#> <logical> <numeric>
#> [1] TRUE 3
#> [2] TRUE 3
#> [3] TRUE 2
#> [4] TRUE 3
#> [5] TRUE 2
#> ... ... ...
#> [160] TRUE 3
#> [161] TRUE 3
#> [162] TRUE 2
#> [163] TRUE 3
#> [164] TRUE 2
#> -------
#> seqinfo: 25 sequences from GRCh37 genome