extraChIPs

Bioc 2023 Package Demonstration

Stevie Pederson (They/Them)

August 2, 2023

https://smped.github.io/Bioc2023_extraChIPs/

Introduction

Who Am I?

Post-doctoral Bioinformatician at Black Ochre Data Labs
- Telethon Kids Institute, Adelaide, Australia
- Traditional lands of the Kaurna nation
Improving health outcomes for Indigenous Australians
- Type 2 Diabetes, Cardiovascular Disease, Chronic Kidney Disease
1400 participants in multi-omics study
- Genomics, Epigenomics, Transcriptomics, Proteomics, Lipidomics etc

Also the developer/maintainer of ngsReports

Why `extraChIPs`?

Dame Roma Mitchell Cancer Research Laboratories (UofA, 2020-2022)
- Integrating multiple ChIP targets across multiple cell lines
- Regulatory dynamics of AR activation with DHT (or SARMs)

Built to enable the GRAVI workflow:
- Gene Regulatory Analysis using Variable Inputs (Poster Thursday)
- Integration of multiple ChIP targets with RNA-Seq + HiC
- Wanted to standardise ‘best practice’ across multiple datasets
- Design evolved alongside GRAVI

Filled gaps between DiffBind (Ross-Innes et al. 2012) and csaw (Lun and Smyth 2016)

Why `extraChIPs`?

Simplify common processes:
- Flexible detection of Differential Signal (Fixed-Width or Sliding Windows)
- Maintaining mcols throughout an analysis
- Easier comparison across multiple targets or treatments
- Enable common visualisations

Integration with the tidyverse (Wickham et al. 2019) + plyranges (Lee et al. 2019)

Use existing object classes:
- SummarizedExperiment, GRanges, GRangesList, List

Applicable well beyond ChIP-Seq (DNA-methylation, ATAC-Seq etc)

Key Functions

Core Infrastructure

reduceMC, intersectMC, setdiffMC, unionMC
chopMC, distinctMC
as_tibble, colToRanges

Working With Peaks/GRanges

importPeaks, makeConsensus, plotOverlaps
bestOverlap, propOverlap
defineRegions, mapByFeature
mapGrlCols ¹

Differential Signal Analysis

fitAssayDiff, addDiffStatus
dualFilter,
mergeByHMP, mergeByCol, mergeBySig

Visualisations

plotAssayDensities, plotAssayPCA, plotAssayRle
plotPie, plotSplitDonut
plotPairwise ¹
plotProfileHeatmap, plotHFGC

Core Infrastructure

*MC Functions

Existing functions reduce, setdiff, intersect, union
- All will drop information in the mcols element

reduceMC, setdiffMC, intersectMC and unionMC
- Retain all information in mcols element
- Returned as CompressedList or atomic vectors
- Heavily used internally

`reduce` Vs. `reduceMC`

GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |          id        gene
         <Rle> <IRanges>  <Rle> | <character> <character>
  [1]     chr1      1-10      * |      range1       gene1
  [2]     chr1      6-12      * |      range2       gene1
  -------
  seqinfo: 1 sequence from example genome

reduce will simply combine (i.e. reduce) overlapping ranges

reduce(x)

GRanges object with 1 range and 0 metadata columns:
      seqnames    ranges strand
         <Rle> <IRanges>  <Rle>
  [1]     chr1      1-12      *
  -------
  seqinfo: 1 sequence from example genome

`reduce` Vs. `reduceMC`

GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |          id        gene
         <Rle> <IRanges>  <Rle> | <character> <character>
  [1]     chr1      1-10      * |      range1       gene1
  [2]     chr1      6-12      * |      range2       gene1
  -------
  seqinfo: 1 sequence from example genome

reduceMC will also combine values in the mcols element

reduceMC(x)

GRanges object with 1 range and 2 metadata columns:
      seqnames    ranges strand |              id        gene
         <Rle> <IRanges>  <Rle> | <CharacterList> <character>
  [1]     chr1      1-12      * |   range1,range2       gene1
  -------
  seqinfo: 1 sequence from example genome

mcols are simplified by default (simplify = TRUE)

`tibble` Coercion

GRanges can be coerced to tibble objects
Ranges are coerced to a character column

tbl <- as_tibble(x)
tbl

# A tibble: 2 × 3
  range     id     gene 
  <chr>     <chr>  <chr>
1 chr1:1-10 range1 gene1
2 chr1:6-12 range2 gene1

rangeAsChar = FALSE will produce similar to as.data.frame

as_tibble(x, rangeAsChar = FALSE)

# A tibble: 2 × 7
  seqnames start   end width strand id     gene 
  <fct>    <int> <int> <int> <fct>  <chr>  <chr>
1 chr1         1    10    10 *      range1 gene1
2 chr1         6    12     7 *      range2 gene1

Coercing `tibble` objects back to `GRanges`

tbl

# A tibble: 2 × 3
  range     id     gene 
  <chr>     <chr>  <chr>
1 chr1:1-10 range1 gene1
2 chr1:6-12 range2 gene1

colToRanges(tbl, var = "range")

GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |          id        gene
         <Rle> <IRanges>  <Rle> | <character> <character>
  [1]     chr1      1-10      * |      range1       gene1
  [2]     chr1      6-12      * |      range2       gene1
  -------
  seqinfo: 1 sequence from an unspecified genome; no seqlengths

Seqinfo objects can also be provided

colToRanges(tbl, var = "range", seqinfo = seqinfo(x))

GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |          id        gene
         <Rle> <IRanges>  <Rle> | <character> <character>
  [1]     chr1      1-10      * |      range1       gene1
  [2]     chr1      6-12      * |      range2       gene1
  -------
  seqinfo: 1 sequence from example genome

`colToRanges`

colToRanges can also move GRanges objects in mcols to the ‘backbone’ ranges

x$centre <- GRanges(paste0("chr1:", c(5, 10)), seqinfo = seqinfo(x))
x

GRanges object with 2 ranges and 3 metadata columns:
      seqnames    ranges strand |          id        gene    centre
         <Rle> <IRanges>  <Rle> | <character> <character> <GRanges>
  [1]     chr1      1-10      * |      range1       gene1    chr1:5
  [2]     chr1      6-12      * |      range2       gene1   chr1:10
  -------
  seqinfo: 1 sequence from example genome

colToRanges(x, var = "centre")

GRanges object with 2 ranges and 2 metadata columns:
      seqnames    ranges strand |          id        gene
         <Rle> <IRanges>  <Rle> | <character> <character>
  [1]     chr1         5      * |      range1       gene1
  [2]     chr1        10      * |      range2       gene1
  -------
  seqinfo: 1 sequence from example genome

Working With Peaks

Demonstration Data ¹

## Define Samples
targets <- c("ER", "H3K27ac")
treat_levels <- c("E2", "E2DHT")
samples <- tibble(
  accession = paste0("SRR83151", 80:91),
  target = rep(targets, each = 6),
  treatment = treat_levels %>% 
    rep(each = 3) %>% 
    rep_len(12) %>% 
    factor(levels = treat_levels),
  input = "SRR8315192"
)
## Split for easier wrangling
samples <- split(samples, samples$target)
samples

$ER
# A tibble: 6 × 4
  accession  target treatment input     
  <chr>      <chr>  <fct>     <chr>     
1 SRR8315180 ER     E2        SRR8315192
2 SRR8315181 ER     E2        SRR8315192
3 SRR8315182 ER     E2        SRR8315192
4 SRR8315183 ER     E2DHT     SRR8315192
5 SRR8315184 ER     E2DHT     SRR8315192
6 SRR8315185 ER     E2DHT     SRR8315192

$H3K27ac
# A tibble: 6 × 4
  accession  target  treatment input     
  <chr>      <chr>   <fct>     <chr>     
1 SRR8315186 H3K27ac E2        SRR8315192
2 SRR8315187 H3K27ac E2        SRR8315192
3 SRR8315188 H3K27ac E2        SRR8315192
4 SRR8315189 H3K27ac E2DHT     SRR8315192
5 SRR8315190 H3K27ac E2DHT     SRR8315192
6 SRR8315191 H3K27ac E2DHT     SRR8315192

Data Directory Contents

-- annotations
   |__chr10_blacklist.bed
   |__chr10_greylist.bed
   |__gencode.v43lift37.chr10.annotation.gtf.gz
-- ER
   |__E2_cov_chr10.bw
   |__E2DHT_cov_chr10.bw
   |__SRR8315180_peaks.narrowPeak
   |__SRR8315180.bam
   |__SRR8315180.bam.bai
   |__SRR8315181_peaks.narrowPeak
   |__SRR8315181.bam
   |__SRR8315181.bam.bai
   |__SRR8315182_peaks.narrowPeak
   |__SRR8315182.bam
   |__SRR8315182.bam.bai
   |__SRR8315183_peaks.narrowPeak
   |__SRR8315183.bam
   |__SRR8315183.bam.bai
   |__SRR8315184_peaks.narrowPeak
   |__SRR8315184.bam
   |__SRR8315184.bam.bai
   |__SRR8315185_peaks.narrowPeak
   |__SRR8315185.bam
   |__SRR8315185.bam.bai
-- H3K27ac
   |__E2_cov_chr10.bw
   |__E2DHT_cov_chr10.bw
   |__H3K27ac_chr10.bed
   |__SRR8315186.bam
   |__SRR8315186.bam.bai
   |__SRR8315187.bam
   |__SRR8315187.bam.bai
   |__SRR8315188.bam
   |__SRR8315188.bam.bai
   |__SRR8315189.bam
   |__SRR8315189.bam.bai
   |__SRR8315190.bam
   |__SRR8315190.bam.bai
   |__SRR8315191.bam
   |__SRR8315191.bam.bai
   |__SRR8315192.bam
   |__SRR8315192.bam.bai

Additional Objects

## A consistent Seqinfo object
sq <- Seqinfo(
  seqnames = "chr10", seqlengths = 135534747, isCircular = FALSE,  genome = "hg19"
)

## Gene Annotations as a GTF, split by feature type
gtf <- file.path(
  "data", "annotations", "gencode.v43lift37.chr10.annotation.gtf.gz"
) %>% 
  import.gff() %>% 
  split(.$type)
seqinfo(gtf) <- sq

## Black & grey-lists (From ENCODE & Input Sample)
bg_list <- file.path("data", "annotations", c("chr10_blacklist.bed", "chr10_greylist.bed")) %>% 
  lapply(import.bed, seqinfo = sq) %>% 
  lapply(granges) %>% 
  GRangesList() %>% 
  unlist() %>% 
  sort()

## Setup colours for consistent plotting
treat_colours <- c(E2 = "steelblue", E2DHT = "red3")
sample_colours <- setNames(
  treat_colours[bind_rows(samples)$treatment], 
  bind_rows(samples)$accession
)

Working With Peaks

Key Steps:

Load peaks identified in individual samples (e.g. macs2 callpeak)

Find overlapping regions from \(> p\times n\) replicates \(\implies\) consensus peaks
- Usually formed within treatment groups
- Merged across treatment groups

Peak callers often produce narrowPeak/broadPeak files

Importing Peaks

importPeaks(
  x,
  type = c("narrow", "broad"),
  blacklist,
  seqinfo,
  pruning.mode = c("coarse", "error"),
  sort = TRUE,
  setNames = TRUE,
  glueNames = "{basename(x)}",
  centre = FALSE,
  nameRanges = TRUE,
  ...
)

Import broad/narrowPeak files \(\implies\) GRangesList
Blacklisted regions can be omitted during parsing
Only chromosomes/scaffolds within the seqinfo object will be returned
Peak Centres can be automatically added for narrowPeak files

Importing Peaks

fl <- file.path("data", "ER", paste0(samples$ER$accession, "_peaks.narrowPeak"))
peaks <- importPeaks(fl, blacklist = bg_list, seqinfo = sq, centre = TRUE)
names(peaks) <- samples$ER$accession
peaks

GRangesList object of length 6:
$SRR8315180
GRanges object with 123 ranges and 6 metadata columns:
                      seqnames            ranges strand |     score signalValue
                         <Rle>         <IRanges>  <Rle> | <numeric>   <numeric>
  SRR8315180_peak_698    chr10 43048224-43048487      * |       251    13.91781
  SRR8315180_peak_699    chr10 43521883-43522126      * |       223    12.88686
  SRR8315180_peak_700    chr10 43540200-43540384      * |        58     6.18569
  SRR8315180_peak_701    chr10 43606308-43606517      * |        92     7.73212
  SRR8315180_peak_702    chr10 44092187-44092576      * |       251    13.91781
                  ...      ...               ...    ... .       ...         ...
  SRR8315180_peak_819    chr10 95796039-95796377      * |       434    18.67842
  SRR8315180_peak_820    chr10 96089595-96089749      * |        25     4.32096
  SRR8315180_peak_821    chr10 99097150-99097384      * |       196     9.02984
  SRR8315180_peak_822    chr10 99168353-99168578      * |        48     5.67022
  SRR8315180_peak_823    chr10 99331430-99331686      * |       210    10.70668
                         pValue    qValue      peak    centre
                      <numeric> <numeric> <numeric> <numeric>
  SRR8315180_peak_698  29.15643  25.19601       117  43048341
  SRR8315180_peak_699  26.25515  22.33688       125  43522008
  SRR8315180_peak_700   9.37883   5.87296        98  43540298
  SRR8315180_peak_701  12.90188   9.27326       110  43606418
  SRR8315180_peak_702  29.15643  25.19601       214  44092401
                  ...       ...       ...       ...       ...
  SRR8315180_peak_819  47.62759  43.44466       186  95796225
  SRR8315180_peak_820   5.88899   2.58967         6  96089601
  SRR8315180_peak_821  23.51470  19.64532       109  99097259
  SRR8315180_peak_822   8.26998   4.82385       153  99168506
  SRR8315180_peak_823  24.91666  21.02332       191  99331621
  -------
  seqinfo: 1 sequence from hg19 genome

...
<5 more elements>

Comparing Replicates

plotOverlaps(peaks, set_col = sample_colours)

Calls ComplexUpset::upset() if n \(\geq\) 3

makeConsensus

makeConsensus(peaks[1:3], var = "centre", p = 2/3)

GRanges object with 164 ranges and 5 metadata columns:
        seqnames            ranges strand |                     centre
           <Rle>         <IRanges>  <Rle> |              <NumericList>
    [1]    chr10 43048195-43048529      * | 43048341,43048357,43048389
    [2]    chr10 43521739-43522260      * | 43522008,43522012,43522039
    [3]    chr10 43540042-43540390      * |          43540298,43540245
    [4]    chr10 43606238-43606573      * | 43606418,43606408,43606421
    [5]    chr10 43851214-43851989      * |          43851672,43851766
    ...      ...               ...    ... .                        ...
  [160]    chr10 99096784-99097428      * | 99097259,99097253,99097249
  [161]    chr10 99168353-99168649      * | 99168506,99168486,99168513
  [162]    chr10 99207868-99208156      * |          99207995,99208002
  [163]    chr10 99331363-99331730      * | 99331621,99331585,99331580
  [164]    chr10 99621632-99621961      * |          99621809,99621828
        SRR8315180 SRR8315181 SRR8315182         n
         <logical>  <logical>  <logical> <numeric>
    [1]       TRUE       TRUE       TRUE         3
    [2]       TRUE       TRUE       TRUE         3
    [3]       TRUE      FALSE       TRUE         2
    [4]       TRUE       TRUE       TRUE         3
    [5]      FALSE       TRUE       TRUE         2
    ...        ...        ...        ...       ...
  [160]       TRUE       TRUE       TRUE         3
  [161]       TRUE       TRUE       TRUE         3
  [162]      FALSE       TRUE       TRUE         2
  [163]       TRUE       TRUE       TRUE         3
  [164]      FALSE       TRUE       TRUE         2
  -------
  seqinfo: 1 sequence from hg19 genome

All/any/no mcols can be returned

makeConsensus

By default, the union of overlapping ranges will be returned
- between black lines

makeConsensus

Can also return only ranges covered by \(>p\times n\) samples
- between red lines
method = "coverage"

Consensus Peaks

To find peaks identified in 2/3 samples in either condition:

treatment_peaks <- list(
    E2 = makeConsensus(peaks[1:3], var = "centre", p = 2/3),
    E2DHT = makeConsensus(peaks[4:6], var = "centre", p = 2/3)
) %>% 
    lapply(select, centre) %>% 
    GRangesList()
treatment_peaks

GRangesList object of length 2:
$E2
GRanges object with 164 ranges and 1 metadata column:
        seqnames            ranges strand |                     centre
           <Rle>         <IRanges>  <Rle> |              <NumericList>
    [1]    chr10 43048195-43048529      * | 43048341,43048357,43048389
    [2]    chr10 43521739-43522260      * | 43522008,43522012,43522039
    [3]    chr10 43540042-43540390      * |          43540298,43540245
    [4]    chr10 43606238-43606573      * | 43606418,43606408,43606421
    [5]    chr10 43851214-43851989      * |          43851672,43851766
    ...      ...               ...    ... .                        ...
  [160]    chr10 99096784-99097428      * | 99097259,99097253,99097249
  [161]    chr10 99168353-99168649      * | 99168506,99168486,99168513
  [162]    chr10 99207868-99208156      * |          99207995,99208002
  [163]    chr10 99331363-99331730      * | 99331621,99331585,99331580
  [164]    chr10 99621632-99621961      * |          99621809,99621828
  -------
  seqinfo: 1 sequence from hg19 genome

$E2DHT
GRanges object with 160 ranges and 1 metadata column:
        seqnames            ranges strand |                     centre
           <Rle>         <IRanges>  <Rle> |              <NumericList>
    [1]    chr10 43048235-43048520      * | 43048383,43048373,43048388
    [2]    chr10 43521795-43522191      * | 43521992,43522008,43522026
    [3]    chr10 43540202-43540457      * | 43540318,43540364,43540361
    [4]    chr10 43606184-43606572      * |          43606426,43606411
    [5]    chr10 44092131-44092563      * | 44092370,44092321,44092385
    ...      ...               ...    ... .                        ...
  [156]    chr10 99168349-99168690      * |          99168488,99168465
  [157]    chr10 99207862-99208157      * | 99207990,99207997,99208044
  [158]    chr10 99331323-99331741      * | 99331567,99331605,99331594
  [159]    chr10 99340107-99340354      * |          99340208,99340289
  [160]    chr10 99621638-99621988      * |          99621822,99621733
  -------
  seqinfo: 1 sequence from hg19 genome

Consensus Peaks

plotOverlaps(treatment_peaks, set_col = treat_colours)

Produces a VennDiagram when n = 2

Consensus Peaks

union_peaks <- makeConsensus(treatment_peaks, var = "centre") %>% 
  mutate(
    centre =  round(vapply(centre, mean, numeric(1)), 0),
    centre = paste0(seqnames, ":", centre) %>% GRanges(seqinfo = sq)
  ) %>% 
  select(centre)
union_peaks

GRanges object with 188 ranges and 1 metadata column:
        seqnames            ranges strand |         centre
           <Rle>         <IRanges>  <Rle> |      <GRanges>
    [1]    chr10 43048195-43048529      * | chr10:43048372
    [2]    chr10 43521739-43522260      * | chr10:43522014
    [3]    chr10 43540042-43540457      * | chr10:43540317
    [4]    chr10 43606184-43606573      * | chr10:43606417
    [5]    chr10 43851214-43851989      * | chr10:43851719
    ...      ...               ...    ... .            ...
  [184]    chr10 99168349-99168690      * | chr10:99168492
  [185]    chr10 99207862-99208157      * | chr10:99208006
  [186]    chr10 99331323-99331741      * | chr10:99331592
  [187]    chr10 99340107-99340354      * | chr10:99340248
  [188]    chr10 99621632-99621988      * | chr10:99621798
  -------
  seqinfo: 1 sequence from hg19 genome

Peak centres are the mean from all replicates with an overlapping peak

Mapping Peaks

After finding consensus/union peaks:
- What type of regulatory regions?
- What genes are likely targets?

defineRegions() uses gene, transcript and exon-level information
- GRangesList of non-overlapping regions (promoter, intron etc)
- Map peaks to these regions using bestOverlap()

mapByFeature() assigns peaks/ranges to most likely genes
- Incorporates promoters, enhancers and long-range interactions

`defineRegions`

defineRegions(
  genes, transcripts, exons,
  promoter = c(2500, 500),
  upstream = 5000,
  intron = TRUE,
  proximal = 10000,
  simplify = FALSE,
  cols = c("gene_id", "gene_name", "transcript_id", "transcript_name"),
  ...
)

Unique Genomic Regions are defined hierarchically using a supplied gtf

Promoters
Upstream Promoters
Exons
Introns
Proximal Intergenic
Distal Intergenic

`defineRegions`

regions <- defineRegions(genes = gtf$gene, transcripts = gtf$transcript, exons = gtf$exon)
names(regions)

[1] "promoter"            "upstream_promoter"   "exon"               
[4] "intron"              "proximal_intergenic" "distal_intergenic"

regions$promoter

GRanges object with 3359 ranges and 5 metadata columns:
         seqnames              ranges strand |                region
            <Rle>           <IRanges>  <Rle> |           <character>
     [1]    chr10         61989-64988      * | Promoter (-2500/+500)
     [2]    chr10         66360-69392      * | Promoter (-2500/+500)
     [3]    chr10         88152-91151      * | Promoter (-2500/+500)
     [4]    chr10         94710-97961      * | Promoter (-2500/+500)
     [5]    chr10       119604-122603      * | Promoter (-2500/+500)
     ...      ...                 ...    ... .                   ...
  [3355]    chr10 135488075-135491074      * | Promoter (-2500/+500)
  [3356]    chr10 135491384-135494383      * | Promoter (-2500/+500)
  [3357]    chr10 135494684-135497683      * | Promoter (-2500/+500)
  [3358]    chr10 135503135-135506134      * | Promoter (-2500/+500)
  [3359]    chr10 135513071-135518323      * | Promoter (-2500/+500)
                                                                 gene_id
                                                         <CharacterList>
     [1]                                            ENSG00000260370.2_11
     [2]  ENSG00000260370.2_11,ENSG00000260370.2_11,ENSG00000260370.2_11
     [3]                                             ENSG00000237297.1_7
     [4] ENSG00000261456.6_7,ENSG00000261456.6_7,ENSG00000261456.6_7,...
     [5]                                             ENSG00000261456.6_7
     ...                                                             ...
  [3355]                                             ENSG00000276046.1_5
  [3356]                                             ENSG00000276904.1_5
  [3357]                                             ENSG00000278641.1_5
  [3358]                                               ENSG00000226880.1
  [3359] ENSG00000213147.3_3,ENSG00000261914.2_3,ENSG00000282875.1_8,...
                                               gene_name
                                         <CharacterList>
     [1]                                 ENSG00000260370
     [2] ENSG00000260370,ENSG00000260370,ENSG00000260370
     [3]                                 ENSG00000237297
     [4]                           TUBB8,TUBB8,TUBB8,...
     [5]                                           TUBB8
     ...                                             ...
  [3355]                                         DUX4L13
  [3356]                                         DUX4L14
  [3357]                                         DUX4L15
  [3358]                                    XX-2136C48.7
  [3359]         RPL23AP60,RPL23AP84,ENSG00000282875,...
                                                           transcript_id
                                                         <CharacterList>
     [1]                                             ENST00000562162.2_4
     [2]     ENST00000682078.1_1,ENST00000566940.2_4,ENST00000683798.1_1
     [3]                                             ENST00000416477.1_2
     [4] ENST00000568584.6_6,ENST00000568866.5_4,ENST00000561967.1_4,...
     [5]                                             ENST00000564130.2_4
     ...                                                             ...
  [3355]                                             ENST00000554103.2_2
  [3356]                                             ENST00000621227.1_2
  [3357]                                             ENST00000611059.1_2
  [3358]                                               ENST00000411632.1
  [3359] ENST00000450011.1_2,ENST00000571193.2_2,ENST00000635574.1_4,...
                                         transcript_name
                                         <CharacterList>
     [1]                                 ENST00000562162
     [2] ENST00000682078,ENST00000566940,ENST00000683798
     [3]                                 ENST00000416477
     [4]               TUBB8-206,TUBB8-207,TUBB8-201,...
     [5]                                       TUBB8-204
     ...                                             ...
  [3355]                                     DUX4L13-201
  [3356]                                     DUX4L14-201
  [3357]                                     DUX4L15-201
  [3358]                                XX-2136C48.7-001
  [3359] RPL23AP60-201,RPL23AP84-201,ENST00000635574,...
  -------
  seqinfo: 1 sequence from hg19 genome

`bestOverlap`

region_levels <- vapply(regions, function(x) x$region[1], character(1))
union_peaks$region <- bestOverlap(union_peaks, unlist(regions), var = "region")
union_peaks$region <- factor(union_peaks$region, levels = region_levels)
union_peaks

GRanges object with 188 ranges and 2 metadata columns:
        seqnames            ranges strand |         centre
           <Rle>         <IRanges>  <Rle> |      <GRanges>
    [1]    chr10 43048195-43048529      * | chr10:43048372
    [2]    chr10 43521739-43522260      * | chr10:43522014
    [3]    chr10 43540042-43540457      * | chr10:43540317
    [4]    chr10 43606184-43606573      * | chr10:43606417
    [5]    chr10 43851214-43851989      * | chr10:43851719
    ...      ...               ...    ... .            ...
  [184]    chr10 99168349-99168690      * | chr10:99168492
  [185]    chr10 99207862-99208157      * | chr10:99208006
  [186]    chr10 99331323-99331741      * | chr10:99331592
  [187]    chr10 99340107-99340354      * | chr10:99340248
  [188]    chr10 99621632-99621988      * | chr10:99621798
                          region
                        <factor>
    [1]    Promoter (-2500/+500)
    [2]    Intergenic (>10kb)   
    [3]    Intergenic (>10kb)   
    [4]    Intron               
    [5]    Intergenic (<10kb)   
    ...                      ...
  [184] Intron                  
  [185] Promoter (-2500/+500)   
  [186] Promoter (-2500/+500)   
  [187] Upstream Promoter (-5kb)
  [188] Upstream Promoter (-5kb)
  -------
  seqinfo: 1 sequence from hg19 genome

Now we have peaks annotated to regions \(\implies\) plotPie()

`plotPie`

plotPie(
  union_peaks, fill = "region", 
  min_p = 0.05, total_size = 4.5, cat_alpha = 0.7, cat_adj = 0.05,
  cat_glue = "{str_wrap(.data[[fill]], 15)}\nn = {n}\n({percent(p, 0.1)})"
) +  
  scale_fill_viridis_d(direction = -1)

Label text is highly customisable
- Uses glue syntax \(\implies\) can include function calls and line-breaks
- Totals available as n, proportions as p, categories as .data[[fill]]

Slices can also be scaled by width

`mapByFeature`

mapByFeature(
  gr,    # Ranges to be mapped
  genes, # Gene-annotations
  prom,  # Ranges annotated as promoters
  enh,   # Ranges annotated as enhancers
  gi,    # GInteractions object
  cols = c("gene_id", "gene_name", "symbol"), # Columns to return (if present)
  gr2prom = 0, gr2enh = 0, gr2gi = 0, # Distances between ranges & features
  gr2gene = 1e+05, prom2gene = 0, enh2gene = 1e+05, gi2gene = 0, ...
)

Ranges overlapping:

a promoter \(\implies\) assigned to that gene
an enhancer \(\implies\) all genes within a specified distance
a long-range interaction \(\implies\) all genes connected by the interaction
none-of the above:
1. all overlapping genes or
2. the nearest gene within a specified distance

`mapByFeature`

union_peaks <- mapByFeature(union_peaks, genes = gtf$gene, prom = regions$promoter)
union_peaks[1:5]

GRanges object with 5 ranges and 4 metadata columns:
      seqnames            ranges strand |         centre                region
         <Rle>         <IRanges>  <Rle> |      <GRanges>              <factor>
  [1]    chr10 43048195-43048529      * | chr10:43048372 Promoter (-2500/+500)
  [2]    chr10 43521739-43522260      * | chr10:43522014 Intergenic (>10kb)   
  [3]    chr10 43540042-43540457      * | chr10:43540317 Intergenic (>10kb)   
  [4]    chr10 43606184-43606573      * | chr10:43606417 Intron               
  [5]    chr10 43851214-43851989      * | chr10:43851719 Intergenic (<10kb)   
                                       gene_id       gene_name
                               <CharacterList> <CharacterList>
  [1] ENSG00000234420.9_10,ENSG00000270762.1_3 ZNF37BP,FXYD6P1
  [2]                        ENSG00000263795.1         MIR5100
  [3]                    ENSG00000165731.21_15             RET
  [4]                    ENSG00000165731.21_15             RET
  [5]                      ENSG00000285712.1_7 ENSG00000285712
  -------
  seqinfo: 1 sequence from hg19 genome

Differential Signal Analysis

Two Approaches to Differential Signal Analysis

Fixed-Width Analaysis
Sliding Windows

Highly comparable to DiffBind

Define a set of fixed-width ranges:
- Centres from importPeaks() \(\implies\) makeConsensus()
- Resize to fixed-width
Count Reads using csaw::regionCounts() + BamFileList
- Returns a RangedSummarizedExperiment
Perform analysis using fitAssayDiff()

Extends csaw approaches

Define window size & step
Count reads using csaw::windowCounts() + BamFileList
- Returns a RangedSummarizedExperiment
Filter windows for those likely containing signal: dualFilter()
- Uses a set of ‘guide ranges’ to set filtering thresholds
Fit windows using fitAssayDiff()
Merge neighbouring windows for final results

Differential Signal Analysis

Visualisation functions after counting reads
- plotAssayPCA, plotAssayRle, plotAssayDensities

Differential Signal Analysis: fitAssayDiff()
- edgeR::glmQLF() or limma-trend

Sliding window analysis \(\implies\) merge neighboring windows
- mergeByHMP() uses harmonic-mean p (Wilson 2019)
- mergeByCol(), mergeBySig(): wraps csaw methods
- All return the relevant underlying windows (keyval_range)

Fixed-Width Windows

library(Rsamtools)
library(csaw)
## Centre & resize the peaks
centred_peaks <- union_peaks %>% 
  mutate(union_peak = granges(.)) %>%  # Add the original ranges to mcols
  colToRanges(var = "centre") %>% # Move the centred range to the 'backbone'
  resize(width = 400, fix = "center") # Resize to a fixed with
## Define the BamFileList
er_bfl <- file.path("data", "ER", paste0(samples$ER$accession, ".bam")) %>% 
  BamFileList() %>% 
  setNames(samples$ER$accession)
## Count Reads
er_se <- regionCounts(er_bfl, centred_peaks, ext = 200)
seqinfo(er_se) <- sq
## Re-assign union peaks to core ranges after counting
rowRanges(er_se) <- rowRanges(er_se) %>% 
  colToRanges(var = "union_peak") %>% 
  select(region, starts_with("gene"))
## Add all sample annotations
colData(er_se) <- colData(er_se) %>% 
  as_tibble(rownames = "accession") %>% 
  left_join(samples$ER) %>% 
  as("DataFrame")

plotAssayDensities(er_se, trans = "log1p", colour = "treatment") +
  scale_colour_manual(values = treat_colours)

plotAssayPCA(er_se, trans = "log1p", colour = "treatment", label = "accession") +
  scale_colour_manual(values = treat_colours)

RLE Plots

RLE By Sample
RLE By Treatment

plotAssayRle(er_se, trans = "log1p", fill = "treatment") +
  scale_fill_manual(values = treat_colours)

plotAssayRle(er_se, trans = "log1p", fill = "treatment", by_x = "treatment") +
  scale_fill_manual(values = treat_colours)

Sliding Windows

## Define the BamFile List INCLUDING the Input
acc <- c(samples$H3K27ac$accession, samples$H3K27ac$input) %>% 
  unique()
h3k_bfl <- file.path("data", "H3K27ac", paste0(acc, ".bam")) %>% 
  BamFileList() %>% 
  setNames(acc)
## Setup the parameters for windows & counts
rp <- readParam(pe = "none", restrict = "chr10", discard = bg_list)
win_size <- 150
win_step <- 50
## Count reads for all windows
wincounts <- windowCounts(
  bam.files = h3k_bfl,
  spacing = win_step, width = win_size, ext = 200,
  filter = length(h3k_bfl), param = rp
)
dim(wincounts)

[1] 467904      7

Filter Sliding Windows

Retain windows which are most likely to contain signal

## Load in the pre-defined ranges
guide_ranges <- file.path("data", "H3K27ac", "H3K27ac_chr10.bed") %>% 
  import.bed(seqinfo = sq)
## Filter using the guide ranges
filtcounts <- dualFilter(
  x = wincounts[, samples$H3K27ac$accession],
  bg  =  wincounts[, unique(samples$H3K27ac$input)],
  ref = guide_ranges, q = 0.6
)
## Update the colData
colData(filtcounts) <- colData(filtcounts) %>% 
  as_tibble(rownames = "accession") %>% 
  left_join(samples$H3K27ac) %>% 
  as("DataFrame")
dim(filtcounts)

[1] 19017     6

assays(filtcounts)

List of length 2
names(2): counts logCPM

`fitAssayDiff()`

fitAssayDiff(
  x, assay = "counts",
  design = NULL, coef = NULL,
  lib.size = "totals",
  method = c("qlf", "lt"),
  norm = c("none", "TMM", "RLE", "TMMwsp", "upperquartile"),
  groups = NULL,
  fc = 1, lfc = log2(fc),
  asRanges = FALSE,
  offset = NULL,
  null = c("interval", "worst.case"),
  weighted = FALSE,
  ...
)

edgeR::glmQLFit is default method (Chen, Lun, and Smyth 2016)
limma-trend for logCPM values (Law et al. 2014)
Can apply range-based H₀ (McCarthy and Smyth 2009)
Optionally normalise within treatment groups (e.g. groups = "treat")
Library sizes can be set as colSums (lib.size = NULL)

Fixed-Width Windows

## Setup the design matrix
X <- model.matrix(~treatment, data = colData(er_se))
## Fit the model
er_results <- fitAssayDiff(
  er_se, design = X, norm = "TMM", asRanges = TRUE, fc = 1.2
) %>% 
  addDiffStatus()
## Print the significant ranges
er_results %>% 
  as_tibble() %>% 
  dplyr::filter(FDR < 0.05) %>% 
  dplyr::select(-gene_id) %>% 
  mutate(
    gene_name = vapply(gene_name, paste, character(1), collapse = "; ")
  ) %>% 
  arrange(PValue)

# A tibble: 7 × 8
  range                   region  gene_name logFC logCPM   PValue     FDR status
  <chr>                   <fct>   <chr>     <dbl>  <dbl>    <dbl>   <dbl> <fct> 
1 chr10:81101906-81102928 Upstre… PPIF      1.91    8.03 4.43e-11 8.34e-9 Incre…
2 chr10:79629641-79630271 Promot… DLG5      0.967   8.23 4.20e- 6 3.95e-4 Incre…
3 chr10:89407752-89408138 Intron  ENSG0000… 1.71    6.20 1.08e- 5 6.79e-4 Incre…
4 chr10:52233596-52233998 Exon    SGMS1; E… 1.13    6.70 2.37e- 4 1.11e-2 Incre…
5 chr10:51547205-51547782 Promot… MSMB      0.665   7.44 1.12e- 3 4.22e-2 Incre…
6 chr10:79266991-79267392 Intron  KCNMA1    1.24    5.56 1.58e- 3 4.52e-2 Incre…
7 chr10:71879941-71880280 Promot… AIFM2     1.20    5.68 1.68e- 3 4.52e-2 Incre…

plotSplitDonut

## Define inner & outer palettes
region_colours <- hcl.colors(length(region_levels), "viridis", rev = TRUE)
names(region_colours) <- region_levels
status_colours <- c(
  Increased = "red3", Decreased = "royalblue3", Unchanged = "grey40",
  Undetected = "grey70"
)
plotSplitDonut(
  er_results, inner = "region", outer = "status",
  inner_palette = region_colours, outer_palette = status_colours,
  inner_glue = "{str_wrap(.data[[inner]],15)}\nn = {n}", inner_label_alpha = 0.8,
  outer_glue = "{str_wrap(.data[[inner]],15)}\nER {.data[[outer]]}\nn = {n}",
  outer_max_p = 0.03, outer_min_p = 0,
  explode_outer = "Increased", explode_r = 0.2
)

Can set inner and outer palettes separately
Labels are customisable using glue
Explode segments based regex queries

Sliding Windows

Involves the additional step of merging sliding windows

X <- model.matrix(~treatment, data = colData(filtcounts))
h3k_all <- fitAssayDiff(filtcounts, norm = "TMM", design = X, asRanges = TRUE)
h3k_results <- mergeByHMP(
  h3k_all, min_win = 2, merge_within = win_size + 1, keyval = "merged"
) %>% 
  addDiffStatus()
arrange(h3k_results, hmp)[1:5]

GRanges object with 5 ranges and 9 metadata columns:
      seqnames            ranges strand | n_windows      n_up    n_down
         <Rle>         <IRanges>  <Rle> | <integer> <integer> <integer>
  [1]    chr10 79256601-79258700      * |        35         4         0
  [2]    chr10 43689201-43690150      * |        17         1         0
  [3]    chr10 74008151-74008750      * |        10         2         0
  [4]    chr10 63859401-63859750      * |         5         1         0
  [5]    chr10 79623301-79635950      * |       251         7         0
                 keyval_range    logCPM     logFC         hmp     hmp_fdr
                    <GRanges> <numeric> <numeric>   <numeric>   <numeric>
  [1] chr10:79257751-79258050   7.12950  1.693320 5.67542e-15 3.64930e-12
  [2] chr10:43689651-43689800   6.69278  1.866274 2.74260e-12 8.81747e-10
  [3] chr10:74008301-74008500   6.54608  1.735779 2.67001e-10 5.72273e-08
  [4] chr10:63859601-63859750   6.03721  1.848069 5.65428e-08 7.76644e-06
  [5] chr10:79630151-79630650   7.87510  0.905969 6.03922e-08 7.76644e-06
         status
       <factor>
  [1] Increased
  [2] Increased
  [3] Increased
  [4] Increased
  [5] Increased
  -------
  seqinfo: 1 sequence from hg19 genome

Map Results

Map ranges after analysis

h3k_results <- h3k_results %>% 
  mutate(
    region = bestOverlap(.$keyval_range, unlist(regions), var = "region")
  ) %>% 
  mapByFeature(genes = gtf$gene, prom = regions$promoter)
arrange(h3k_results, hmp)[1:5]

GRanges object with 5 ranges and 12 metadata columns:
      seqnames            ranges strand | n_windows      n_up    n_down
         <Rle>         <IRanges>  <Rle> | <integer> <integer> <integer>
  [1]    chr10 79256601-79258700      * |        35         4         0
  [2]    chr10 43689201-43690150      * |        17         1         0
  [3]    chr10 74008151-74008750      * |        10         2         0
  [4]    chr10 63859401-63859750      * |         5         1         0
  [5]    chr10 79623301-79635950      * |       251         7         0
                 keyval_range    logCPM     logFC         hmp     hmp_fdr
                    <GRanges> <numeric> <numeric>   <numeric>   <numeric>
  [1] chr10:79257751-79258050   7.12950  1.693320 5.67542e-15 3.64930e-12
  [2] chr10:43689651-43689800   6.69278  1.866274 2.74260e-12 8.81747e-10
  [3] chr10:74008301-74008500   6.54608  1.735779 2.67001e-10 5.72273e-08
  [4] chr10:63859601-63859750   6.03721  1.848069 5.65428e-08 7.76644e-06
  [5] chr10:79630151-79630650   7.87510  0.905969 6.03922e-08 7.76644e-06
         status                region                                   gene_id
       <factor>           <character>                           <CharacterList>
  [1] Increased                Intron                     ENSG00000156113.25_17
  [2] Increased    Intergenic (<10kb)                     ENSG00000198915.12_14
  [3] Increased    Intergenic (>10kb)                       ENSG00000166295.9_6
  [4] Increased    Intergenic (<10kb)                     ENSG00000150347.17_12
  [5] Increased Promoter (-2500/+500) ENSG00000204049.1_8,ENSG00000151208.17_11
                 gene_name
           <CharacterList>
  [1]               KCNMA1
  [2]             RASGEF1A
  [3]              ANAPC16
  [4]               ARID5B
  [5] ENSG00000204049,DLG5
  -------
  seqinfo: 1 sequence from hg19 genome

Comparing Results

Comparing Two Targets

Check where change is found in both/either/neither
- mapGrlCols()

Plot combined status
- plotSplitDonut()

Compare binding patterns:
- Both signal (logCPM) and change (logFC)
- plotPairwise()

Inspect Genomic Regions of interest
- plotHFGC()

Comparing Two Targets

Comparison functions utilise GRangesList objects
Require shared column names

all_results <- GRangesList(
  ER = er_results,
  H3K27ac = h3k_results %>% 
    select(
      region, starts_with("gene"), starts_with("log"), 
      PValue = hmp, FDR = hmp_fdr, status
    )
)

mapGrlCols

Columns from each element can be mapped
- Columns can be combined and collapsed into a single column
Viable for a named GRangesList of any size

all_results %>% 
  mapGrlCols(var = c("logFC", "status"), collapse = "gene_name")

GRanges object with 741 ranges and 5 metadata columns:
        seqnames            ranges strand |  ER_logFC H3K27ac_logFC ER_status
           <Rle>         <IRanges>  <Rle> | <numeric>     <numeric>  <factor>
    [1]    chr10 42862951-42863350      * |        NA     0.3345653 NA       
    [2]    chr10 43047801-43048800      * | -0.301091    -0.0780895 Unchanged
    [3]    chr10 43047801-43048800      * | -0.301091    -0.2619553 Unchanged
    [4]    chr10 43132901-43134800      * |        NA     0.1350775 NA       
    [5]    chr10 43135701-43137250      * |        NA     0.0791231 NA       
    ...      ...               ...    ... .       ...           ...       ...
  [737]    chr10 99496401-99497850      * |        NA     0.0551099 NA       
  [738]    chr10 99608301-99610050      * |        NA    -0.5008959 NA       
  [739]    chr10 99621632-99621988      * |  0.115139            NA Unchanged
  [740]    chr10 99628751-99629400      * |        NA    -0.2298606 NA       
  [741]    chr10 99894101-99895400      * |        NA    -0.1143540 NA       
        H3K27ac_status              gene_name
              <factor>            <character>
    [1]      Unchanged        ENSG00000291065
    [2]      Unchanged       FXYD6P1; ZNF37BP
    [3]      Unchanged       FXYD6P1; ZNF37BP
    [4]      Unchanged ENSG00000285884; ZNF..
    [5]      Unchanged ENSG00000285884; ZNF..
    ...            ...                    ...
  [737]      Unchanged                ZFYVE27
  [738]      Unchanged    GOLGA7B; GOLGA7B-DT
  [739]      NA                       GOLGA7B
  [740]      Unchanged        CRTAC1; GOLGA7B
  [741]      Unchanged                R3HCC1L
  -------
  seqinfo: 1 sequence from hg19 genome

mapGrlCols

Columns from each element can be mapped
- Columns can be combined and collapsed into a single column
Makes finding sites changed for both comparisons simple

all_results %>% 
  mapGrlCols(var = c("logFC", "status"), collapse = "gene_name") %>% 
  filter(
    str_detect(ER_status, "(In|De)creased"), str_detect(H3K27ac_status, "(In|De)creased")
  )

GRanges object with 3 ranges and 5 metadata columns:
      seqnames            ranges strand |  ER_logFC H3K27ac_logFC ER_status
         <Rle>         <IRanges>  <Rle> | <numeric>     <numeric>  <factor>
  [1]    chr10 71879101-71881150      * |   1.20340      0.810052 Increased
  [2]    chr10 79623301-79635950      * |   0.96745      0.905969 Increased
  [3]    chr10 81101651-81103150      * |   1.90628      1.577851 Increased
      H3K27ac_status             gene_name
            <factor>           <character>
  [1]      Increased                 AIFM2
  [2]      Increased DLG5; ENSG00000204049
  [3]      Increased                  PPIF
  -------
  seqinfo: 1 sequence from hg19 genome

plotPairwise: logCPM

plotPairwise(all_results, var = "logCPM")

plotPairwise: logFC

plotPairwise(
  all_results, var = "logFC", colour = "status", label = "gene_name"
) +
  scale_colour_manual(
    values = c("lightskyblue", "palevioletred", "grey", "darkred")
  ) +
  scale_fill_manual(
    values = c(
      setNames(status_colours, paste("ER", names(status_colours))),
      setNames(status_colours, paste("H3K27ac", names(status_colours)))
    )
  )

Colouring by status can be informative
The furthest point from zero (per group) can be labelled
Point colours and boxplot fills can be set using ggplot2 methods

plotHFGC

Plots multiple tracks using Gviz (Hahne and Ivanek 2016)
- HiC (or other interactions)
- Features of any type
- Gene models
- Coverage

All tracks are optional
Cytogenetic bands for GRCh37/38 are included in the package

data("grch37.cytobands")

plotHFGC

HiC tracks

GInteractions object

Feature tracks

A named GRangesList \(\implies\) single track coloured by element
- Output from defineRegions()
A named list of (named) GRangesList objects \(\implies\) multiple tracks

Gene Tracks

GRanges object for a single track
- Easily formed from exon-level annotations
Named GRangesList will draw multiple tracks (Up, Down etc.)

plotHFGC

Coverage tracks

A BigWigFileList \(\implies\) each file as a separate track
A list of BigwigFileList objects
- Will overlay files within each BigWigFileList
- Each list element on a separate track

Can also annotate coverage tracks (e.g. Increased, Decreased etc)

plotHFGC

gr <- all_results %>% 
  mapGrlCols(var = "status", collapse = "gene_name") %>% 
  filter(
    ER_status %in% c("Increased", "Decreased"), 
    H3K27ac_status %in% c("Increased", "Decreased"), 
    gene_name == "PPIF"
  )
plotHFGC(
  gr, cytobands = grch37.cytobands,
  features = regions, featcol = region_colours,
  genes = gene_models,
  coverage = cov_bwfl, linecol = line_col, 
  annotation = cov_annot, annotcol = status_colours,
  zoom = 20
)

Once all objects defined:
- Inspect multiple ranges quickly & easily
- Zoom in/out as needed

References

Chen, Yunshun, Aaron A T Lun, and Gordon K Smyth. 2016. “From Reads to Genes to Pathways: Differential Expression Analysis of RNA-Seq Experiments Using Rsubread and the edgeR Quasi-Likelihood Pipeline.” F1000Research 5: 1438. https://doi.org/10.12688/f1000research.8987.2.

Hahne, Florian, and Robert Ivanek. 2016. “Statistical Genomics: Methods and Protocols.” In, edited by Ewy Mathé and Sean Davis, 335–51. New York, NY: Springer New York. https://doi.org/10.1007/978-1-4939-3578-9_16.

Hickey, Theresa E, Luke A Selth, Kee Ming Chia, Geraldine Laven-Law, Heloisa H Milioli, Daniel Roden, Shalini Jindal, et al. 2021. “The Androgen Receptor Is a Tumor Suppressor in Estrogen Receptor-Positive Breast Cancer.” Nat. Med. 27 (2): 310–20.

Law, Charity W, Yunshun Chen, Wei Shi, and Gordon K Smyth. 2014. “Voom: Precision Weights Unlock Linear Model Analysis Tools for RNA-seq Read Counts.” Genome Biol. 15 (2): R29.

Lee, Stuart, Cook, Dianne, Lawrence, and Michael. 2019. “Plyranges: A Grammar of Genomic Data Transformation.” Genome Biol. 20 (1): 4. http://dx.doi.org/10.1186/s13059-018-1597-8.

Lun, Aaron T L, and Gordon K Smyth. 2016. “Csaw: A Bioconductor Package for Differential Binding Analysis of ChIP-Seq Data Using Sliding Windows.” Nucleic Acids Res. 44 (5): e45.

McCarthy, Davis J, and Gordon K Smyth. 2009. “Testing Significance Relative to a Fold-Change Threshold Is a TREAT.” Bioinformatics 25 (6): 765–71.

Ross-Innes, Caryn S., Rory Stark, Andrew E. Teschendorff, Kelly A. Holmes, H. Raza Ali, Mark J. Dunning, Gordon D. Brown, et al. 2012. “Differential Oestrogen Receptor Binding Is Associated with Clinical Outcome in Breast Cancer.” Nature 481: –4. http://www.nature.com/nature/journal/v481/n7381/full/nature10730.html.

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wilson, Daniel J. 2019. “The Harmonic Mean p-Value for Combining Dependent Tests.” Proc. Natl. Acad. Sci. U. S. A. 116 (4): 1195–1200.

extraChIPs

Introduction

Who Am I?

Why extraChIPs?

Why extraChIPs?

Key Functions

Core Infrastructure

Working With Peaks/GRanges

Differential Signal Analysis

Visualisations

Core Infrastructure

*MC Functions

reduce Vs. reduceMC

reduce Vs. reduceMC

tibble Coercion

Coercing tibble objects back to GRanges

colToRanges

Working With Peaks

Demonstration Data 1

Data Directory Contents

Additional Objects

Working With Peaks

Importing Peaks

Importing Peaks

Comparing Replicates

makeConsensus

makeConsensus

makeConsensus

Consensus Peaks

Consensus Peaks

Consensus Peaks

Mapping Peaks

defineRegions

defineRegions

bestOverlap

plotPie

plotPie

mapByFeature

mapByFeature

Differential Signal Analysis

Two Approaches to Differential Signal Analysis

Differential Signal Analysis

Fixed-Width Windows

Visualising Counts

RLE Plots

Sliding Windows

Filter Sliding Windows

fitAssayDiff()

Fixed-Width Windows

plotSplitDonut

plotSplitDonut

Sliding Windows

Map Results

Comparing Results

Comparing Two Targets

Comparing Two Targets

mapGrlCols

mapGrlCols

plotPairwise: logCPM

plotPairwise: logFC

plotPairwise: logFC

plotHFGC

plotHFGC

HiC tracks

Feature tracks

Gene Tracks

plotHFGC

Coverage tracks

plotHFGC

plotHFGC

References

Why `extraChIPs`?

Why `extraChIPs`?

`reduce` Vs. `reduceMC`

`reduce` Vs. `reduceMC`

`tibble` Coercion

Coercing `tibble` objects back to `GRanges`

`colToRanges`

Demonstration Data ¹

`defineRegions`

`defineRegions`

`bestOverlap`

`plotPie`

`plotPie`

`mapByFeature`

`mapByFeature`

`fitAssayDiff()`