Provided by: samtools_1.19.2-1build2_amd64

NAME

       samtools-split - splits a file by read group.

SYNOPSIS

       samtools split [options] merged.sam|merged.bam|merged.cram

DESCRIPTION

       Splits  a  file  by  read  group,  or  a specified tag, producing one or more output files
       matching a common prefix (by default based on the input filename).

       Unless the -d option is used, the file will be split according to the @RG tags  listed  in
       the  header.   Records  without  an  RG tag or with an RG tag undefined in the header will
       cause the program to exit with an error unless the -u option is used.

       RG values defined in the header but with no records  will  produce  an  output  file  only
       containing a header.

       If  the  -d  TAG option is used, the file will be split on the value in the given aux tag.
       Note that only string tags (type Z) are currently supported.   Unless  the  -u  option  is
       used, the program will exit with an error if it finds a record without the given tag.

       Note that attempting to split on a tag with high cardinality may result in the creation of
       a large number of output files.  To prevent this, the -M option can be used to set a limit
       on the number of splits made.
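
       The interaction between -d, -M and -u can be sketched as follows.  This is an
       illustrative model only, not the samtools implementation: route_by_tag is a
       hypothetical helper, None stands for a record without the tag, and the zero-based
       output index is an assumption.

```python
def route_by_tag(tag_values, max_split=100, unmatched=None):
    """Model where each record goes when splitting on a tag (illustrative only).

    tag_values: one entry per record; None models a record without the tag.
    max_split:  models -M; a negative value removes the limit.
    unmatched:  models -u FILE; None means -u was not given.
    """
    index_of = {}   # tag value -> output index, in order of first appearance
    routed = []
    for value in tag_values:
        under_limit = max_split < 0 or len(index_of) < max_split
        # A new tag value opens a new output only while under the limit.
        if value is not None and value not in index_of and under_limit:
            index_of[value] = len(index_of)
        if value in index_of:
            routed.append(index_of[value])
        elif unmatched is not None:
            # Missing tag, or new value seen after the limit was reached.
            routed.append(unmatched)
        else:
            raise ValueError(f"unmatched record (tag value {value!r}) and no -u file")
    return routed


print(route_by_tag(["a", "b", "c", "a", None], max_split=2, unmatched="rest.bam"))
```

       With a limit of 2, the third distinct value and the untagged record both go to the
       -u file in this model; without a -u file the same input raises an error.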

       Using  -d  RG behaves in a similar way to the default (without -d), opening an output file
       for each @RG line in the header.  However, unlike the default, new output  files  will  be
       opened for any RG tags found in the alignment records, regardless of whether they have a
       matching @RG line in the header.
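
       The difference between the default mode and -d RG can be modelled roughly as below.
       planned_outputs is a hypothetical helper written for this illustration; it ignores
       the -u and -M options and is not taken from the samtools source.

```python
def planned_outputs(header_rg_ids, record_rg_ids, by_tag=False):
    """List the read groups that get an output file (illustrative only).

    by_tag=False models the default mode; by_tag=True models -d RG.
    The -u and -M options are deliberately left out of this sketch.
    """
    # Both modes open one output per @RG header line, even if it gets no records.
    outputs = list(header_rg_ids)
    for rg in record_rg_ids:
        if rg in outputs:
            continue
        if by_tag and rg is not None:
            # -d RG: a value absent from the header still gets its own output.
            outputs.append(rg)
        else:
            # Missing RG tag, or (default mode) an RG not defined in the header.
            raise ValueError(f"record with unmatched RG {rg!r}")
    return outputs


print(planned_outputs(["A", "B"], ["A", "C"], by_tag=True))
```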

       The -u option may be used to specify the output filename for any records with a missing or
       unrecognised tag.  This option will always write out a file even if there are no records.

       Output format defaults to BAM.  For SAM or CRAM output, either set the format with
       --output-fmt or use -f to set the file extension, e.g. -f %*_%#.sam.

OPTIONS

       -u FILE1      Put reads with no tag or an unrecognised tag into FILE1

       -h FILE2      Use the header from FILE2 when writing the file  given  in  the  -u  option.
                     This  header  completely  replaces  the one from the input file.  It must be
                     compatible with the input file header, which means it  must  have  the  same
                     number  of  references listed in the @SQ lines and the references must be in
                     the same order and have the same lengths.

       -f STRING     Output filename format string (see below) ["%*_%#.%."]

       -d TAG        Split reads by TAG value into distinct files.  Supply only the tag key with
                     this option.  The tag value must be a string (i.e. key:Z:value).

       -M,--max-split NUM
                     Limit the number of files created by the -d option  to  NUM  (default  100).
                     This prevents accidents where trying to split on a tag with high cardinality
                     could result in the creation of a very large number of output  files.   Once
                     the  file  limit is reached, any tag values not already seen will be treated
                     as unmatched and the program will exit with an error unless the -u option is
                     in use.

                     If  desired,  the limit can be removed using -M -1, although in practice the
                     number of outputs will still be restricted by system limits on the number of
                     files that can be open at once.

                     If splitting by read group, and the read group count in the header is higher
                     than the requested limit then the limit will be raised to match.

       -v            Verbose output

       --no-PG       Do not add a @PG line to the header of the output file.

       Format string expansions:

                 %%   %
                 %*   basename
                 %#   index (of @RG in the header, or count of TAG values seen so far)
                 %!   @RG ID or TAG value
                 %.   output format filename extension
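
       The expansions above can be illustrated with a small sketch.  expand_format is a
       hypothetical helper, not part of samtools; the zero-based index, the basename with
       its extension already removed, and the error on an unknown % sequence are all
       assumptions made for this example.

```python
def expand_format(fmt, basename, index, tag_value, extension):
    """Expand a samtools-split style filename format string (illustrative only)."""
    expansions = {
        "%": "%",          # %%  literal percent sign
        "*": basename,     # %*  basename
        "#": str(index),   # %#  index of the @RG line or TAG value
        "!": tag_value,    # %!  @RG ID or TAG value
        ".": extension,    # %.  output format filename extension
    }
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        # Assumption of this sketch: reject anything outside the table above.
        if i + 1 >= len(fmt) or fmt[i + 1] not in expansions:
            raise ValueError(f"unsupported expansion at position {i} in {fmt!r}")
        out.append(expansions[fmt[i + 1]])
        i += 2
    return "".join(out)


print(expand_format("%*_%#.%.", "merged", 0, "grp1", "bam"))   # merged_0.bam
print(expand_format("%*_%!.sam", "merged", 3, "grp1", "bam"))  # merged_grp1.sam
```

       Using %! instead of %# names each file after the read group ID or tag value rather
       than its position.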

       -@,--threads INT
                     Number of input/output compression threads to use in addition to the main
                     thread [0].
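
       The compatibility rule stated for -h (same number of references, in the same order,
       with the same lengths) can be checked ahead of time with a sketch like the one below.
       sq_lengths and headers_compatible are hypothetical helpers that work on SAM header
       text; they compare only the LN fields of the @SQ lines, as the rule is worded above,
       and may not match every check samtools performs.

```python
def sq_lengths(header_text):
    """Return the LN value of each @SQ line, in header order."""
    lengths = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
            lengths.append(int(fields["LN"]))
    return lengths


def headers_compatible(input_header, replacement_header):
    """True if both headers list the same reference lengths in the same order."""
    return sq_lengths(input_header) == sq_lengths(replacement_header)


original = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:500\n"
replacement = original + "@RG\tID:unassigned\n"
print(headers_compatible(original, replacement))
```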

AUTHOR

       Written by Martin Pollard from the Sanger Institute.

SEE ALSO

       samtools(1), samtools-addreplacerg(1)

       Samtools website: <http://www.htslib.org/>