Chapter 3 Quick Start Guide
3.1 Install Snakemake
You will need a snakemake installation to begin.
Please see the Snakemake installation documentation for help setting this up.
If you are running the pipeline on an HPC and are unsure, please consult with your HPC support team about setting up a snakemake profile on your specific cluster.
3.2 Create the Directory Structure
- Create a new GitHub repository on your account by going to the GitHub template repository
- Download your new repository to your local server or HPC using git clone <myrepository>
- Place your bam files in the subdirectory data/bam or data/aligned as described in section 4.2
- Edit samples.tsv in the config directory as described in section 4.3
- Ensure you have the blacklist as a BED file and the annotations as a GTF file
- Modify any parameters in config/config.yml
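Before launching anything, it can help to confirm that the layout from the list above is in place. The following is a minimal sketch only: check_layout is a hypothetical helper (not part of the workflow), it tests just the paths named above, and it searches data/bam and data/aligned recursively because the exact layout is defined in section 4.2.

```shell
# Pre-flight check for the directory layout described above.
# check_layout is a hypothetical helper name, not part of the workflow.
check_layout() {
    local root="${1:-.}"
    local status=0
    local found=0
    local f d

    # Files the steps above ask you to edit
    for f in config/samples.tsv config/config.yml; do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done

    # bam files are expected somewhere under data/bam or data/aligned
    for d in data/bam data/aligned; do
        if find "$root/$d" -name '*.bam' 2>/dev/null | grep -q .; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: no .bam files under data/bam or data/aligned" >&2
        status=1
    fi

    return "$status"
}
```

From the repository root, check_layout . && echo "layout looks complete" reports any missing pieces on stderr. The blacklist and annotation files are not checked here because their locations are set in the configuration.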
3.3 Run the Pipeline
3.3.1 Run On A Local Server
To run using 16 cores without any queuing system (e.g. on a local machine), enter the following:
snakemake -p --use-conda --notemp --keep-going --rerun-triggers mtime --cores 16
3.3.2 Run On An HPC
Please consult with your local support team for their advice on running a snakemake workflow.
In essence, the above command will need to be provided to your queuing system through your cluster's preferred submission strategy.
Once configured, the same snakemake profile can generally be reused across workflows, but setting it up may require expertise from the technical support team.
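As one illustration of handing the local command to a queuing system, the sketch below writes a single-node batch script. It assumes the scheduler is SLURM, which may not match your cluster, and the job name, memory and time values are placeholders rather than recommendations. Your support team may prefer a snakemake profile instead.

```shell
# Write a single-node SLURM batch script that wraps the local command.
# ASSUMPTIONS: SLURM is the scheduler; the job name, memory and time
# values are placeholders, not recommendations from the workflow authors.
cat > run_pipeline.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=snakemake_pipeline
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=24:00:00

# Same command as for a local server, with the core count taken from SLURM
snakemake -p --use-conda --notemp --keep-going --rerun-triggers mtime \
    --cores "$SLURM_CPUS_PER_TASK"
EOF

# Submit from the repository root (uncomment on a SLURM cluster):
# sbatch run_pipeline.sh
```

This runs the whole workflow inside one allocation, which is the simplest strategy but does not spread jobs across nodes.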
3.4 Tips And Tricks
3.4.1 Removing Large Files
Some large files, such as R Environments and BedGraph files, are marked as temp files internally. Because the run command in section 3.3.1 passes --notemp, these files are kept during the run; they can be removed after completion of the workflow using:
snakemake --delete-temp-output --cores 1
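Since deletion cannot be undone, a cautious variant is to list the files first. Snakemake documents that --delete-temp-output combined with --dry-run lists the temp files without deleting anything. The sketch below uses that to ask for confirmation; clean_temp_output is a hypothetical helper name.

```shell
# List temp files first, then delete only after confirmation.
# clean_temp_output is a hypothetical helper, not part of the workflow.
clean_temp_output() {
    local answer

    # With --dry-run, the files are listed but nothing is removed
    snakemake --delete-temp-output --dry-run --cores 1 || return 1

    printf 'Delete these files? [y/N] ' >&2
    read -r answer
    if [ "$answer" = "y" ]; then
        snakemake --delete-temp-output --cores 1
    fi
}
```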
3.4.3 Running Restricted Sections of the Workflow
Snakemake can run a workflow up to a chosen point using the argument --until followed by the stage at which you wish to stop. For example, the argument --until compile_macs2_summary_html would only run the workflow until the macs2 summaries are compiled, which may be preferable for checking QC before proceeding to differential expression and pairwise comparisons.
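Put together with the local run command from section 3.3.1, a partial run might look like the sketch below. The flags and the rule name compile_macs2_summary_html come from the text above; run_until is a hypothetical wrapper name used only to keep the example compact.

```shell
# Run the workflow only up to a named rule.
# run_until is a hypothetical wrapper, not part of the workflow.
run_until() {
    local rule="${1:-}"
    local cores="${2:-16}"

    if [ -z "$rule" ]; then
        echo "usage: run_until <rule> [cores]" >&2
        return 2
    fi

    snakemake -p --use-conda --notemp --keep-going \
        --rerun-triggers mtime --cores "$cores" --until "$rule"
}
```

For instance, run_until compile_macs2_summary_html stops once the macs2 summaries are compiled. Re-running the full command afterwards continues from the outputs already produced.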