Provided by: art-nextgen-simulation-tools_20160605+dfsg-4build2_amd64
NAME
art_454 - Simulation of 454 Pyrosequencing
DESCRIPTION
ART is a set of simulation tools that generate synthetic next-generation sequencing reads. ART simulates sequencing reads by mimicking the real sequencing process with empirical error models or quality profiles summarized from large recalibrated sequencing data. art_454 simulates 454 pyrosequencing reads.
USAGE
SINGLE-END SIMULATION
    art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE>

PAIRED-END SIMULATION
    art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <FOLD_COVERAGE> <MEAN_FRAG_LEN> <STD_DEV>

AMPLICON SEQUENCING SIMULATION
    art_454 [-s] [-a] [-t] [-r rand_seed] [-p read_profile] [-c num_flow_cycles] <-A|-B> <INPUT_SEQ_FILE> <OUTPUT_FILE_PREFIX> <#_READS/#_READ_PAIRS_PER_AMPLICON>
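
The three synopsis forms differ only in their trailing positional arguments and in the -A/-B switch. As a minimal sketch of that structure, the helper below assembles an argument list for each mode. The function name build_art_454_args and its parameter names are hypothetical and not part of ART; only the flags and positional arguments come from the synopsis above. Rejecting fragment-size arguments in amplicon mode is an assumption based on the synopsis listing none for that mode.

```python
from typing import List, Optional


def build_art_454_args(
    input_seq_file: str,
    output_prefix: str,
    amount: float,                      # FOLD_COVERAGE, or reads/read pairs per amplicon
    mean_frag_len: Optional[float] = None,
    std_dev: Optional[float] = None,
    amplicon: Optional[str] = None,     # "A" (single-end) or "B" (paired-end)
    sam: bool = False,                  # -s
    aln: bool = False,                  # -a
    titanium: bool = False,             # -t
    rand_seed: Optional[int] = None,    # -r
    read_profile: Optional[str] = None, # -p
    num_flow_cycles: Optional[int] = None,  # -c
) -> List[str]:
    """Hypothetical helper: build an art_454 argument list from the synopsis."""
    if (mean_frag_len is None) != (std_dev is None):
        raise ValueError("MEAN_FRAG_LEN and STD_DEV must be given together")
    if amplicon is not None and amplicon not in ("A", "B"):
        raise ValueError("amplicon must be 'A' or 'B'")
    if amplicon is not None and mean_frag_len is not None:
        # Assumption: the synopsis shows no fragment-size arguments in amplicon mode.
        raise ValueError("amplicon mode takes no fragment-size arguments")

    args = ["art_454"]
    if sam:
        args.append("-s")
    if aln:
        args.append("-a")
    if titanium:
        args.append("-t")
    if rand_seed is not None:
        args += ["-r", str(rand_seed)]
    if read_profile is not None:
        args += ["-p", read_profile]
    if num_flow_cycles is not None:
        args += ["-c", str(num_flow_cycles)]
    if amplicon is not None:
        args.append("-" + amplicon)

    # Positional arguments, in the order given by the synopsis.
    args += [input_seq_file, output_prefix, str(amount)]
    if mean_frag_len is not None:
        args += [str(mean_frag_len), str(std_dev)]
    return args
```

The resulting list can be passed to subprocess.run(args, check=True) on a system where art_454 is installed; the sketch itself never invokes the program.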
OPTIONS
MANDATORY OPTIONS
    INPUT_SEQ_FILE - the filename of DNA/RNA reference sequences in FASTA format
    OUTPUT_FILE_PREFIX - the prefix or directory of output read data file (*.fq) and read alignment file (*.aln)
    FOLD_COVERAGE - the fold of read coverage over the reference sequences
    MEAN_FRAG_LEN - the average DNA fragment size for paired-end read simulation
    STD_DEV - the standard deviation of the DNA fragment size for paired-end read simulation
    #READS_PER_AMPLICON - number of reads per amplicon (for 5' end amplicon sequencing)
    #READ_PAIRS_PER_AMPLICON - number of read pairs per amplicon (for two-end amplicon sequencing)

OPTIONAL PARAMETERS
    -A  perform single-end amplicon sequencing simulation
    -B  perform paired-end amplicon sequencing simulation
    -M  use CIGAR 'M' instead of '=/X' for alignment match/mismatch
    -a  output the ALN alignment file
    -s  output the SAM alignment file
    -d  print out warning messages for debugging
    -t  simulate reads from the built-in GS FLX Titanium profile [default: GS FLX profile]
    -r  specify a fixed random seed for the simulation (to generate two identical datasets from two different runs)
    -c  specify the number of flow cycles by the sequencer [default: 100 for GS-FLX, and 200 for GS-FLX Titanium]
    -p  specify user's own read profile for simulation

NOTE: the name of a read profile is the directory containing the read profile data files. Please read the README file about the format of 454 read profile data files and the default filenames of these data files.
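
To get a feel for what a given FOLD_COVERAGE value implies, the sketch below applies the usual definition of coverage (total read bases divided by reference length) to estimate a read count. This is a rough planning aid only: the function names are hypothetical, the mean read length is a value the caller must supply, and the way ART itself derives the number of reads from FOLD_COVERAGE is not described on this page and may differ.

```python
import math


def fasta_total_length(fasta_text: str) -> int:
    """Sum the sequence lengths of all records in FASTA-formatted text."""
    total = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and record headers
        total += len(line)
    return total


def estimate_read_count(reference_len: int, fold_coverage: float,
                        mean_read_len: float) -> int:
    """Estimate reads needed, assuming coverage = reads * mean_read_len / reference_len."""
    if reference_len <= 0 or mean_read_len <= 0 or fold_coverage < 0:
        raise ValueError("lengths must be positive and coverage non-negative")
    return math.ceil(reference_len * fold_coverage / mean_read_len)
```

For instance, a 1,000,000-base reference at 20X with an assumed mean read length of 400 bases works out to 50,000 reads under this definition.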
EXAMPLES
1) single-end simulation with 20X coverage
    art_454 -s seq_reference.fa ./outdir/single_dat 20

2) paired-end simulation with the mean fragment size 1500 and STD 20 using GS FLX Titanium platform
    art_454 -s -t seq_reference.fa ./outdir/paired_dat 10 1500 20

3) paired-end simulation with a fixed random seed
    art_454 -s -r 777 seq_reference.fa ./outdir/paired_fxSeed 10 2500 50

4) single-end amplicon sequencing with 10 reads per amplicon
    art_454 -A -s amplicon_ref.fa ./outdir/amp_single 10

5) paired-end amplicon sequencing with 10 read pairs per amplicon
    art_454 -B -s amplicon_ref.fa ./outdir/amp_paired 10
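
The -r option is documented as producing two identical datasets from two different runs. One way to check that, sketched here, is to run the fixed-seed example twice with different output prefixes and compare checksums of the resulting files. The helper names are hypothetical, and the paths to compare are left to the caller because this page does not spell out the exact output filenames. Comparing the read files is likely the safest check, since other outputs could embed run-specific text such as an output path; that is an assumption, not something this page states.

```python
import hashlib


def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files have byte-identical content."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Calling same_content() on the read files from two runs that used the same seed and the same inputs should then return True if the runs are reproducible.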
AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.