R/AllGenerics.R
, R/getProfileData.R
getProfileData-methods.Rd
Get coverage Profile Data surrounding specified ranges
getProfileData(x, gr, ...)
# S4 method for class 'BigWigFile,GenomicRanges'
getProfileData(
x,
gr,
upstream = 2500,
downstream = upstream,
bins = 100,
mean_mode = "w0",
log = TRUE,
offset = 1,
n_max = Inf,
...
)
# S4 method for class 'BigWigFileList,GenomicRanges'
getProfileData(
x,
gr,
upstream = 2500,
downstream = upstream,
bins = 100,
mean_mode = "w0",
log = TRUE,
offset = 1,
BPPARAM = SerialParam(),
...
)
# S4 method for class 'character,GenomicRanges'
getProfileData(
x,
gr,
upstream = 2500,
downstream = upstream,
bins = 100,
mean_mode = "w0",
log = TRUE,
offset = 1,
...
)
A BigWigFile or BigWiFileList
A GRanges object
Passed to normalizeToMatrix
The distance to extend upstream from the centre of each
range within gr
The distance to extend downstream from the centre of each
range within gr
The total number of bins to break the extended ranges into
The method used for calculating the score for each bin. See normalizeToMatrix for details
logical(1) Should the returned values be log2-transformed
Value added to data if log-transforming. Ignored otherwise
Upper limit on the number of ranges to return profile data for. By default, no limit will be applied .
Passed internally to bplapply
GRanges or GrangesList with column profile_data, as described above
This will take all provided ranges and set as identical width ranges, extending by the specified amount both up and downstream of the centre of the provided ranges. By default, the ranges extensions are symmetrical and only the upstream range needs to be specified, however this parameterisation allows for non-symmetrical ranges to be generated.
These uniform width ranges will then be used to extract the value contained in the score field from one or more BigWigFiles. Uniform width ranges are then broken into bins of equal width and the average score found within each bin.
The binned profiles are returned as a DataFrameList called profile_data
as
a column within the resized GRanges object.
Column names in each DataFrame are score
, position
and bp
.
If passing a BigWigFileList, profiles will be obtained in series by
default. To run in parallel pass a MulticoreParam object
to the BPPARAM
argument.
bw <- system.file("tests", "test.bw", package = "rtracklayer")
gr <- GRanges("chr2:1000")
pd <- getProfileData(bw, gr, upstream = 500, bins = 10)
pd
#> GRanges object with 1 range and 1 metadata column:
#> seqnames ranges strand | profile_data
#> <Rle> <IRanges> <Rle> | <SplitDataFrameList>
#> [1] chr2 500-1499 * | -2.00:u1:-450,-1.02:u2:-350,-1.00:u3:-250
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
pd$profile_data
#> SplitDataFrameList of length 1
#> [[1]]
#> DataFrame with 10 rows and 3 columns
#> score position bp
#> <numeric> <factor> <numeric>
#> 1 -2.00000000 u1 -450
#> 2 -1.02000000 u2 -350
#> 3 -1.00000000 u3 -250
#> 4 -1.00000000 u4 -150
#> 5 -0.42673675 u5 -50
#> 6 -0.41503750 d1 50
#> 7 -0.41503750 d2 150
#> 8 -0.00830075 d3 250
#> 9 0.00000000 d4 350
#> 10 0.00000000 d5 450
#>