Chapter 3 Quick Start Guide
3.1 Install Snakemake
You will need a snakemake installation to begin.
Please see here for help setting this up.
If you are running the pipeline on an HPC and are unsure how to proceed, please consult your HPC support team about setting up a snakemake profile on your specific cluster.
3.2 Create the Directory Structure
- Create a new GitHub repository on your account by going to the GitHub template repository
- Download your new repository to your local server or HPC using git clone <myrepository>
- Place your bam files in the subdirectory data/bam or data/aligned as described in section 4.2
- Edit samples.tsv in the config directory as described in section 4.3
- Ensure you have the blacklist as a bed file and annotations as gtf
- Modify any parameters in config/config.yml
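The checklist above can be verified before launching the workflow. The following is a minimal sketch only: check_layout is a hypothetical helper, not part of the workflow, and it tests just the paths named in the list. The blacklist and gtf files are left out because their locations are not given in this section.

```shell
#!/bin/sh
# Sketch: confirm the files from the checklist exist before launching.
# check_layout is a hypothetical helper, not something the workflow provides.
check_layout() {
  root="${1:-.}"   # repository root; defaults to the current directory
  status=0

  # bam files may live in either data/bam or data/aligned
  if [ ! -d "$root/data/bam" ] && [ ! -d "$root/data/aligned" ]; then
    echo "missing: data/bam or data/aligned" >&2
    status=1
  fi

  # sample sheet and parameter file edited in the steps above
  for f in config/samples.tsv config/config.yml; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f" >&2
      status=1
    fi
  done

  return "$status"
}

# Run from the repository root; a non-zero status means something is missing.
check_layout . || echo "Layout incomplete; see messages above." >&2
```

The function only reports what is missing and never creates or modifies files, so it is safe to run repeatedly while setting up the repository.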
3.3 Run the Pipeline
3.3.1 Run On A Local Server
To run using 16 cores without any queuing system (e.g. on a local machine), enter the following:
snakemake -p --use-conda --notemp --keep-going --rerun-triggers mtime --cores 16
3.3.2 Run On An HPC
Please consult your local support team for advice on running a snakemake workflow.
In essence, the above command will need to be provided to your queuing system through the preferred strategy.
The snakemake profile required will generally be stable across all workflows, but setting it up may require expertise from the technical support team.
3.4 Tips And Tricks
3.4.1 Removing Large Files
Some large files, such as R Environments and BedGraph files, are marked as temp files internally. These can be removed after completion of the workflow using:
snakemake --delete-temp-output --cores 1
3.4.3 Running Restricted Sections of the Workflow
Snakemake can run a workflow up to a chosen point by adding the argument --until followed by the rule at which the workflow should stop. For example, the argument --until compile_macs2_summary_html would only run the workflow until the macs2 summaries are compiled, which may be preferable for checking QC before proceeding to differential expression and pairwise comparisons.
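As an illustration, the --until argument can be appended to the local command from section 3.3.1. This is a sketch under stated assumptions: until_cmd is a hypothetical helper that only prints the assembled command, and the remaining flags are copied unchanged from section 3.3.1.

```shell
#!/bin/sh
# Sketch: the command from section 3.3.1 with --until appended.
# until_cmd is a hypothetical helper; it prints the command and runs nothing.
until_cmd() {
  rule="$1"          # rule at which to stop, e.g. compile_macs2_summary_html
  cores="${2:-16}"   # defaults to the 16 cores used in section 3.3.1
  echo "snakemake -p --use-conda --notemp --keep-going --rerun-triggers mtime --cores $cores --until $rule"
}

# Print the command for inspection before running it by hand.
until_cmd compile_macs2_summary_html
```

Printing the command first makes it easy to check the rule name before committing compute time. Once it looks right, it can be pasted into the terminal or handed to the queuing system.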