Ubuntu Manpage: salmon_index - highly-accurate, transcript-level quantification estimates from RNA-seq data

Provided by: salmon_1.4.0+ds1-1build1_amd64

NAME

       salmon_index - highly-accurate, transcript-level quantification estimates from RNA-seq data

DESCRIPTION

       Index ========== Creates a salmon index.

   Command Line Options:
       -v [ --version ]
              print version string

       -h [ --help ]
              produce help message

       -t [ --transcripts ] arg
              Transcript fasta file.

       -k [ --kmerLen ] arg (=31)
              The size of k-mers that should be used for the quasi index.

       -i [ --index ] arg
              salmon index.

       --gencode
              This  flag  will  expect  the  input  transcript fasta to be in GENCODE format, and will split the
              transcript name at the first '|' character.  These reduced names will be used in  the  output  and
              when looking for these transcripts in a gene to transcript GTF.

       --features
              This flag will expect the input reference to be in the tsv file format, and will split the feature
              name  at  the  first  'tab'  character.   These  reduced names will be used in the output and when
              looking for the sequence of the features.GTF.

       --keepDuplicates
              This flag will disable the default indexing behavior of  discarding  sequence-identical  duplicate
              transcripts.   If this flag is passed, then duplicate transcripts that appear in the input will be
              retained and quantified separately.

       -p [ --threads ] arg (=2)
              Number of threads to use during indexing.

       --keepFixedFasta
              Retain the fixed fasta file (without short transcripts and duplicates, clipped,  etc.)   generated
              during indexing

       -f [ --filterSize ] arg (=-1) The size of the Bloom filter that will be used
              by  TwoPaCo  during  indexing.  The filter will be of size 2^{filterSize}. The default value of -1
              means that the filter size will be automatically set based on the number of distinct k-mers in the
              input, as estimated by nthll.

       --tmpdir arg
              The directory location that will be used for TwoPaCo temporary files; it will be created  if  need
              be  and  be  removed  prior  to  indexing  completion.  The default value will cause a (temporary)
              subdirectory of the salmon index directory to be used for this purpose.

       --sparse
              Build the index using a  sparse  sampling  of  k-mer  positions  This  will  require  less  memory
              (especially  during quantification), but will take longer to construct and can slow down mapping /
              alignment

       -d [ --decoys ] arg
              Treat these sequences ids from the reference as the decoys that may have  sequence  homologous  to
              some  known  transcript.  for example in case of the genome, provide a list of chromosome name ---
              one per line

       --type arg (=puff)
              The type of index to build; the only option is "puff" in this version of salmon.

salmon --no-version-check index 1.4.0+ds1         December 2020                                  SALMON_INDEX(1)