run.bwamem.sh

Description

Map single-end or paired-end reads of a batch of samples in parallel using BWA MEM (Li 2013), samtools (Li et al. 2009) and GNU parallel (Tange 2011).

Usage

run.bwamem.sh -s <file> -r <file> -e <string> -d <directory> -T <positive integer> -Q <positive integer> \
              -o <directory> -t <positive integer>

Dependencies

bwa
sambamba
java
python
picard-tools
r

Arguments

# Required
-s        File with sample names (without header or '>')
-r        File with reference sequences (FASTA format)
-e        Sample file extension(s), e.g. '.fasta' or '.trim.fq.gz' [unpaired] or '.trim1.fastq,.trim2.fastq' [paired].
          Separate the file extensions of a file pair with a ','. The script infers from this whether the data are unpaired or paired.

# Optional [DEFAULT]
-d  [pwd] Path to input reads.
-T  [10]  Minimum bwa mem alignment score, passed to -T parameter of bwa mem.
-Q  [20]  Minimum mapping quality, used to filter on the MAPQ column (fifth field) of the BAM files. Must satisfy Q >= T.
-o  [pwd] Path to output directory. A folder will be created if it does not exist.
-t  [3]   Number of samples processed in parallel.
          Each sample uses ${cpu} CPU cores (cpu=4), so the allowed range is 1 (4 cores in total) to 6 (24 cores in total).
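
The constraints on these arguments can be sketched as two small checks. This is a minimal illustration, not code taken from run.bwamem.sh: the function names read_layout and check_args and the error messages are hypothetical, and cpu=4 is taken from the -t description above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and messages are illustrative, not taken from run.bwamem.sh.

# Print "unpaired" for one extension, "paired" for two comma-separated extensions.
read_layout() {
  local -a parts
  IFS=',' read -r -a parts <<< "$1"
  case "${#parts[@]}" in
    1) echo "unpaired" ;;
    2) echo "paired" ;;
    *) echo "error: -e expects one or two extensions" >&2; return 1 ;;
  esac
}

# Check -T, -Q and -t, then print the total number of CPU cores that would be used.
check_args() {
  local T="$1" Q="$2" t="$3" cpu=4 v
  for v in "$T" "$Q" "$t"; do
    case "$v" in
      ''|*[!0-9]*) echo "error: '$v' is not a positive integer" >&2; return 1 ;;
    esac
    [ "$v" -gt 0 ] || { echo "error: '$v' is not a positive integer" >&2; return 1; }
  done
  [ "$Q" -ge "$T" ] || { echo "error: -Q ($Q) must be >= -T ($T)" >&2; return 1; }
  [ "$t" -le 6 ] || { echo "error: -t must be between 1 and 6" >&2; return 1; }
  echo $(( 10#$t * cpu ))
}

read_layout '.trim1.fastq,.trim2.fastq'   # paired
check_args 10 20 3                        # 12
```

With the defaults (-T 10, -Q 20, -t 3) the checks pass and the run would occupy 12 cores in total.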

Details

-s  The sample file must contain the sample basenames (i.e., sample name without file extensions specified in -e).
    Use a single line for paired reads in two files, e.g. the sample basename for files 'SH598_S16.trim1.fastq.gz'
    and 'SH598_S16.trim2.fastq.gz' would be 'SH598_S16' if -e is set to '.trim1.fastq.gz,.trim2.fastq.gz'.

-Q  A value above 0 filters out reads with multiple mappings. Only reads that pass the -Q filter are written to
    the output directory, after PCR duplicates have been removed with the Picard MarkDuplicates tool.
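
How -T, -Q and the duplicate removal fit together can be sketched as a dry run that only prints commands. This is one plausible sequence under stated assumptions, not the script's actual code: the function name print_pipeline, the intermediate file names, the use of samtools for sorting, filtering and indexing, the picard.jar path, and the use of ${name}.dupstats.txt as the MarkDuplicates metrics file are all illustrative.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints commands only, runs nothing. Intermediate file names,
# the samtools calls and the picard.jar path are assumptions for illustration.
print_pipeline() {
  local name="$1" ref="$2" T="$3" Q="$4" cpu=4
  shift 4
  local reads="$*"                              # one file (unpaired) or two (paired)
  local sorted="${name}.bwa-mem.sorted.bam"
  local filtered="${name}.bwa-mem.sorted.Q${Q}.bam"
  local nodup="${name}.bwa-mem.sorted.Q${Q}.nodup.bam"
  # 1. Align, dropping alignments with score below T, and coordinate-sort.
  printf '%s\n' "bwa mem -t ${cpu} -T ${T} ${ref} ${reads} | samtools sort -o ${sorted} -"
  # 2. Keep reads with MAPQ >= Q.
  printf '%s\n' "samtools view -b -q ${Q} -o ${filtered} ${sorted}"
  # 3. Remove PCR duplicates and write duplicate metrics.
  printf '%s\n' "java -jar picard.jar MarkDuplicates I=${filtered} O=${nodup} M=${name}.dupstats.txt REMOVE_DUPLICATES=true"
  # 4. Index the final BAM.
  printf '%s\n' "samtools index ${nodup}"
}

print_pipeline SH598_S16 ref.fasta 10 20 SH598_S16.trim1.fastq.gz SH598_S16.trim2.fastq.gz
```

Printing the commands instead of running them keeps the sketch independent of which tools are installed.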

Value

An output directory with a subfolder ${name} for each sample, containing these main files:

- bwa.Q${Q}.log                               Log file.
- ${name}.bwa-mem.sorted.Q${Q}.nodup.bam      BAM file with reads mapped with high quality (MAIN RESULT).
- ${name}.bwa-mem.sorted.Q${Q}.nodup.bam.bai  BAI index file.
- ${name}.flagstats.txt                       Flagstat summary of all mapped reads.
- ${name}.Q${Q}.nodup.flagstats.txt           Flagstat summary of reads mapped with high quality.
- ${name}.dupstats.txt                        PCR duplicates statistics.
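
The naming scheme above makes it straightforward to check whether a sample finished. The following is a minimal sketch; the function names expected_outputs and missing_outputs are hypothetical and not part of run.bwamem.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the files documented above for one sample.
expected_outputs() {
  local outdir="$1" name="$2" Q="$3" d
  d="${outdir}/${name}"
  printf '%s\n' \
    "${d}/bwa.Q${Q}.log" \
    "${d}/${name}.bwa-mem.sorted.Q${Q}.nodup.bam" \
    "${d}/${name}.bwa-mem.sorted.Q${Q}.nodup.bam.bai" \
    "${d}/${name}.flagstats.txt" \
    "${d}/${name}.Q${Q}.nodup.flagstats.txt" \
    "${d}/${name}.dupstats.txt"
}

# Report any expected file that is missing or empty.
missing_outputs() {
  local f
  expected_outputs "$@" | while IFS= read -r f; do
    [ -s "$f" ] || printf 'missing: %s\n' "$f"
  done
}
```

For example, `missing_outputs . SH598_S16 20` prints nothing when all six files for sample SH598_S16 exist and are non-empty in the current output directory.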

Examples

run.bwamem.sh -s samples.txt -r ref.fasta -e '.trim1.fastq.gz,.trim2.fastq.gz' -T 10 -Q 20 -d NS-run1_trimmed -t 4
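
A sample file such as samples.txt could be generated from the read files themselves. This is a sketch under the assumption that every sample has a file ending in the first extension passed to -e; the function name list_samples is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one basename per sample by stripping the first
# extension given in -e from the matching file names in a directory.
list_samples() {
  local dir="$1" first="${2%%,*}" f
  for f in "$dir"/*"$first"; do
    [ -e "$f" ] || continue          # glob matched nothing
    f="${f##*/}"                     # drop the directory part
    printf '%s\n' "${f%"$first"}"    # drop the extension
  done
}

# Usage, with the directory and extensions from the example above:
# list_samples NS-run1_trimmed '.trim1.fastq.gz,.trim2.fastq.gz' > samples.txt
```

Only the first extension is used for matching, so each read pair yields a single line, as the -s argument requires.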

Authors

Simon Crameri (ETHZ) and Stefan Zoller (GDC)

References

Li, H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997.
Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078-2079.
Tange, O. 2011. GNU Parallel: The command-line power tool. ;login: The USENIX Magazine 36(1):42-47.

CaptureAl v0.1 Documentation
