Merge overlapping ranges based on p-values — mergeBySig • extraChIPs

Merge overlapping windows using p-values from significance testing

mergeBySig(x, ...)

# S4 method for class 'GenomicRanges'
mergeBySig(
  x,
  df = NULL,
  logfc = "logFC",
  pval = "P",
  cpm = "logCPM",
  inc_cols,
  p_adj_method = "fdr",
  alpha = 0.05,
  method = c("combine", "best", "minimal"),
  merge_within = 1L,
  ignore_strand = TRUE,
  min_win = 1,
  ...
)

# S4 method for class 'RangedSummarizedExperiment'
mergeBySig(
  x,
  df = NULL,
  logfc = "logFC",
  pval = "P",
  cpm = "logCPM",
  inc_cols,
  p_adj_method = "fdr",
  alpha = 0.05,
  method = c("combine", "best", "minimal"),
  merge_within = 1L,
  ignore_strand = TRUE,
  ...
)

Arguments

x: GenomicRanges object
...: Passed to all csaw functions being wrapped
df: data.frame with results of differential binding analysis performed using a sliding window strategy. If not provided, the columns in the mcols() element of x will be used
logfc, pval, cpm: Column names for the values holding window specific estimates of change in binding (logfc), overall signal intensity (cpm) and the significance from statistical testing (pval)
inc_cols: (Optional) Character vector of any additional columns in df to return
p_adj_method: One of p.adjust.methods
alpha: Significance threshold to apply during internal calculations
method: Shorthand versions for which csaw strategy to use for merging windows. Choose from 'combine' (combineTests), 'best' (getBestTest) or 'minimal' (minimalTests).
merge_within: Merge any non-overlapping windows within this distance
ignore_strand: Passed internally to reduce and findOverlaps
min_win: Only keep merged windows derived from at least this number

Value

A GenomicRanges object with overlapping ranges from the original object merged and representative values returned. The range corresponding to the representative values is also returned

Details

When using sliding windows to test for differential signal, overlapping windows can be merged based on the significance of results. mergeBySig() is a wrapper to the functions combineTests, getBestTest and minimalTests, using each function's approach to finding a representative window. The returned object differs from those returned by the original functions in that the description of windows as 'up', 'down' or mixed is omitted and the genomic range corresponding to the representative window is also returned. Column names also correspond to those in the original object.

An additional column with adjusted p-values is returned. This column retains the same name as the original but with the suffix '_*' added where the p-value adjustment method is added after the underscore.

Examples

x <- GRanges(c("chr1:1-10", "chr1:6-15", "chr1:51-60"))
set.seed(1001)
df <- DataFrame(logFC = rnorm(3), logCPM = rnorm(3,8), p = rexp(3, 10))
mcols(x) <- df
mergeBySig(x, pval = "p", method = "combine")
#> GRanges object with 2 ranges and 8 metadata columns:
#>       seqnames    ranges strand | n_windows      n_up    n_down keyval_range
#>          <Rle> <IRanges>  <Rle> | <integer> <integer> <integer>    <GRanges>
#>   [1]     chr1      1-15      * |         2         1         1    chr1:1-10
#>   [2]     chr1     51-60      * |         1         0         0   chr1:51-60
#>          logCPM     logFC         p     p_fdr
#>       <numeric> <numeric> <numeric> <numeric>
#>   [1]   5.49346  2.188648 0.0184835  0.036967
#>   [2]   7.85644 -0.185275 0.3201591  0.320159
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
mergeBySig(x, pval = "p", method = "best")
#> GRanges object with 2 ranges and 8 metadata columns:
#>       seqnames    ranges strand | n_windows      n_up    n_down keyval_range
#>          <Rle> <IRanges>  <Rle> | <integer> <integer> <integer>    <GRanges>
#>   [1]     chr1      1-15      * |         2         1         1    chr1:6-15
#>   [2]     chr1     51-60      * |         1         0         0   chr1:51-60
#>          logCPM     logFC         p     p_fdr
#>       <numeric> <numeric> <numeric> <numeric>
#>   [1]   7.44269 -0.177547 0.0252419 0.0504838
#>   [2]   7.85644 -0.185275 0.3201591 0.3201591
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
mergeBySig(x, pval = "p", method = "min")
#> GRanges object with 2 ranges and 8 metadata columns:
#>       seqnames    ranges strand | n_windows      n_up    n_down keyval_range
#>          <Rle> <IRanges>  <Rle> | <integer> <integer> <integer>    <GRanges>
#>   [1]     chr1      1-15      * |         2         1         1    chr1:1-10
#>   [2]     chr1     51-60      * |         1         0         0   chr1:51-60
#>          logCPM     logFC         p     p_fdr
#>       <numeric> <numeric> <numeric> <numeric>
#>   [1]   5.49346  2.188648 0.0252419 0.0504838
#>   [2]   7.85644 -0.185275 0.3201591 0.3201591
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths