Provided by: gasic_0.0.r19-8_all
NAME
create_matrix - calculate the genome abundance similarity matrix
SYNOPSIS
create_matrix [options] NAMES
DESCRIPTION
Calculate the similarity matrix. First, a set of reads is simulated for every reference genome using the read simulator from core/tools.py selected with -s. Second, the simulated reads of each species are mapped against all reference genomes using the mapper selected with -m. Third, the resulting SAM files are analyzed to calculate the similarity matrix. The similarity matrix is stored as a NumPy file (-o).
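As a rough illustration of the third step, the following Python sketch turns per-pair mapped-read counts into a similarity matrix and stores it as a NumPy file. It assumes that entry (i, j) is the fraction of reads simulated from genome i that map to genome j; that definition, the function name similarity_from_counts, and the counts themselves are illustrative assumptions and are not taken from the tool's source. Read simulation, mapping, and SAM parsing are left out.

```python
import os
import tempfile

import numpy as np


def similarity_from_counts(mapped_counts, reads_per_genome):
    """Return a matrix whose entry (i, j) is the fraction of reads
    simulated from genome i that mapped to genome j.

    Hypothetical helper; the normalisation is an assumption.
    """
    counts = np.asarray(mapped_counts, dtype=float)
    totals = np.asarray(reads_per_genome, dtype=float)
    # Divide every row i by the number of reads simulated for genome i.
    return counts / totals[:, np.newaxis]


# Made-up counts for two genomes: row = source genome, column = target genome.
mapped = [[1000, 120],
          [90, 1000]]
similarity = similarity_from_counts(mapped, [1000, 1000])

# Store and reload the matrix the way any .npy file can be handled.
out_path = os.path.join(tempfile.mkdtemp(), "similarity_matrix.npy")
np.save(out_path, similarity)
loaded = np.load(out_path)
```

A file written with -o can be read back with np.load in the same way, since it is an ordinary NumPy file.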
OPTIONS
NAMES
    Filename of the names file; the plain text names file should contain one name per line. The name is used as identifier in the whole algorithm.

-h, --help
    show this help message and exit

-s SIMULATOR, --simulator=SIMULATOR
    Identifier of read simulator defined in core/tools.py [default: none]

-r REF, --reference=REF
    Reference sequence file pattern for the read simulator. Placeholder for the name is "%s". [default: ./ref/%s.fasta]

-m MAPPER, --mapper=MAPPER
    Identifier of mapper defined in core/tools.py [default: none]

-i INDEX, --index=INDEX
    Reference index files for the read mapper. Placeholder for the name is "%s". [default: ./ref/%s.fasta]

-t TEMP, --temp=TEMP
    Directory to store temporary simulated datasets and SAM files. [default: ./temp]

-o OUT, --output=OUT
    Output similarity matrix file. [default: ./similarity_matrix.npy]
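To make the "%s" placeholder concrete, the sketch below reads a names file with one name per line and expands the default -r pattern into one path per name. The helper names read_names and expand_pattern and the genome names are hypothetical, and the use of Python's "%" string formatting is an assumption suggested by the placeholder syntax, not a statement about how the tool does it internally.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of a names file, one name per line."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def expand_pattern(pattern, names):
    """Substitute each name for the "%s" placeholder in the pattern."""
    return [pattern % name for name in names]


# Write a small example names file (the genome names are made up).
names_path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(names_path, "w") as handle:
    handle.write("genome_a\ngenome_b\n")

names = read_names(names_path)
ref_files = expand_pattern("./ref/%s.fasta", names)
print(ref_files)  # ['./ref/genome_a.fasta', './ref/genome_b.fasta']
```

Under this reading, every name listed in the names file needs a matching file for both the -r and the -i pattern.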