Plot the Sequence Length Distribution across one or more FASTQC reports

plotSeqLengthDistn(
  x,
  usePlotly = FALSE,
  labels,
  pattern = ".(fast|fq|bam).*",
  ...
)

# S4 method for class 'ANY'
plotSeqLengthDistn(
  x,
  usePlotly = FALSE,
  labels,
  pattern = ".(fast|fq|bam).*",
  ...
)

# S4 method for class 'character'
plotSeqLengthDistn(
  x,
  usePlotly = FALSE,
  labels,
  pattern = ".(fast|fq|bam).*",
  ...
)

# S4 method for class 'FastqcData'
plotSeqLengthDistn(
  x,
  usePlotly = FALSE,
  labels,
  pattern = ".(fast|fq|bam).*",
  counts = TRUE,
  plotType = c("line", "cdf"),
  expand.x = c(0, 0.2, 0, 0.2),
  plotlyLegend = FALSE,
  colour = "red",
  ...
)

# S4 method for class 'FastqcDataList'
plotSeqLengthDistn(
  x,
  usePlotly = FALSE,
  labels,
  pattern = ".(fast|fq|bam).*",
  counts = FALSE,
  plotType = c("heatmap", "line", "cdf"),
  cluster = FALSE,
  dendrogram = FALSE,
  heat_w = 8,
  pwfCols,
  showPwf = TRUE,
  scaleFill = NULL,
  scaleColour = NULL,
  heatCol = hcl.colors(50, "inferno"),
  plotlyLegend = FALSE,
  ...
)

Arguments

x

A FastqcData object, a FastqcDataList object, or a character vector of file paths

usePlotly

logical. If TRUE, return an interactive plotly object; if FALSE, return a ggplot2 object.

labels

An optional named vector of labels for the file names. All filenames must be present in the names.

pattern

Regex to remove from the end of any filenames

...

Used to pass additional attributes to theme()

counts

logical. Should distributions be shown as counts (TRUE) or as frequencies, i.e. percentages (FALSE)?

plotType

character. One of "heatmap", "line" or "cdf". As shown in the usage above, "heatmap" is only offered by the FastqcDataList method.

expand.x

Output from expansion() or numeric vector of length 4. Passed to scale_x_discrete

plotlyLegend

logical(1). Show the legend for interactive line plots.

colour

Line colour

cluster

logical, default FALSE. If TRUE, the FastQC data will be clustered using hierarchical clustering.

dendrogram

logical. Redundant if both cluster and usePlotly are FALSE. If both cluster and dendrogram are TRUE, the dendrogram will be displayed.

heat_w

Relative width of any heatmap plot components

pwfCols

Object of class PwfCols giving the colours for pass, warning and fail values in the plot

showPwf

logical(1). Show the PASS/WARN/FAIL status.

scaleFill, scaleColour

Optional ggplot scale objects

heatCol

The colour scheme for the heatmap
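
As a rough illustration of the labels and pattern arguments described above, the sketch below strips the default pattern from some file names and builds a custom named vector of labels. The file names are hypothetical, and the use of sub() is an assumption made for illustration only; how the package applies the pattern internally is not stated in this section.

```r
# Hypothetical file names, for illustration only
fileNames <- c("sampleA_R1.fastq.gz", "sampleB_R1.fq", "sampleC.bam")

# The default pattern shown in the usage section
pattern <- ".(fast|fq|bam).*"

# Assumption: the pattern is removed in a way comparable to sub().
# Note the leading "." is unescaped, so it matches any single character.
defaultLabels <- sub(pattern, "", fileNames)
print(defaultLabels)

# A custom label vector: the names are the file names, the values the labels.
# Every file name must appear in the names.
customLabels <- setNames(c("Control", "Treated 1", "Treated 2"), fileNames)
print(customLabels[["sampleB_R1.fq"]])
```

A vector shaped like customLabels is the kind of object the labels argument expects, provided its names match the file names held in x.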

Value

A standard ggplot2 object, or an interactive plotly object

Details

This extracts the Sequence Length Distribution from the supplied object and generates a ggplot2 object, with a set of minimal defaults. The output of this function can be further modified using the standard ggplot2 methods.
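
A minimal sketch of such a modification follows, assuming the ngsReports and ggplot2 packages are installed and that the line plot is returned as a plain ggplot2 object. The title text and legend position are arbitrary choices made for illustration.

```r
library(ngsReports)
library(ggplot2)

# Load the example FastQC reports shipped with the package
packageDir <- system.file("extdata", package = "ngsReports")
fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)
fdl <- FastqcDataList(fl)

# Request a line plot, then adjust it with ordinary ggplot2 layers
p <- plotSeqLengthDistn(fdl, plotType = "line")
p2 <- p +
  labs(title = "Sequence length distribution") +
  theme(legend.position = "bottom")
```

Printing p2 draws the adjusted plot. Theme settings can also be supplied directly through the ... argument, which is passed to theme().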

A cumulative distribution (cdf) plot can also be generated by setting plotType = "cdf". This can help when choosing a minimum read length in some NGS workflows. If all libraries have reads of identical lengths, these plots may be less informative.
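
To make the link between a cumulative distribution and a minimum read length concrete, here is a small base R sketch on made-up numbers. The lengths and counts are hypothetical, and the calculation is only an illustration of the idea; it is not a description of how the package computes its cdf curve.

```r
# Toy length distribution (hypothetical values, not taken from any report)
lengths <- c(50, 75, 100, 125, 150)
counts  <- c(100, 300, 800, 1800, 7000)

# Cumulative proportion of reads with length <= each value
cdf <- cumsum(counts) / sum(counts)

# Proportion of reads kept if reads shorter than each length are discarded
retained <- rev(cumsum(rev(counts))) / sum(counts)

summaryTable <- data.frame(length = lengths, cdf = cdf, retained = retained)
print(summaryTable)

# Largest minimum-length threshold that still keeps at least 90% of reads
minLength <- max(lengths[retained >= 0.9])
print(minLength)
```

With these toy numbers, a threshold of 100 keeps 96% of reads while a threshold of 125 keeps only 88%, so 100 is the largest cut-off that retains at least 90%.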

An alternative interactive plot is available by setting the argument usePlotly = TRUE.

Examples


# Get the files included with the package
packageDir <- system.file("extdata", package = "ngsReports")
fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)

# Load the FASTQC data as a FastqcDataList object
fdl <- FastqcDataList(fl)

# Plot using the default plotType, which is "heatmap" for a FastqcDataList
plotSeqLengthDistn(fdl)


# Or plot the cdf
plotSeqLengthDistn(fdl, plotType = "cdf")
#> `geom_line()`: Each group consists of only one observation.
#>  Do you need to adjust the group aesthetic?