Provided by: samtools_1.17-1_amd64

NAME

       samtools-import - converts FASTQ files to unmapped SAM/BAM/CRAM

SYNOPSIS

       samtools import [options] [ fastq_file ... ]

DESCRIPTION

       Reads  one  or more FASTQ files and converts them to unmapped SAM, BAM or CRAM.  The input
       files may be automatically decompressed if they have a .gz extension.

       The simplest usage in the absence of any other command line options is to provide  one  or
       two input files.

       If a single file is given, it will be interpreted as a single-ended sequencing format
       unless the read names end with /1 and /2, in which case the records will be labelled as
       PAIRED with the READ1 or READ2 BAM flags set.  If a pair of filenames is given, the files
       will be read alternately to produce an interleaved output file, also setting the PAIRED
       and READ1 / READ2 flags.
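
       The two input layouts described above can be tried on a tiny data set.  The following is
       a minimal sketch, not part of the original page: the file names and the read frag1 are
       hypothetical, and the samtools commands only run when samtools is installed.  The second
       column printed by cut is the SAM flag value of each record.

```shell
# Hypothetical file names; two one-record FASTQ files forming a read pair.
printf '@frag1/1\nACGTACGT\n+\nIIIIIIII\n' > pair_1.fq
printf '@frag1/2\nTTGACCAA\n+\nIIIIIIII\n' > pair_2.fq

# The same two records as a single interleaved file with /1 and /2 suffixes.
cat pair_1.fq pair_2.fq > interleaved.fq

# Run the conversions only when samtools is available on this system.
if command -v samtools >/dev/null 2>&1; then
    # Two positional files: read alternately, written interleaved.
    samtools import pair_1.fq pair_2.fq | samtools view - | cut -f 1,2
    # One file: pairing is inferred from the /1 and /2 read name endings.
    samtools import interleaved.fq | samtools view - | cut -f 1,2
fi
```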

       The  filenames  may be explicitly labelled using -1 and -2 for READ1 and READ2 data files,
       -s for an interleaved paired file (or one half of a paired-end run), -0 for unpaired  data
       and explicit index files specified with --i1 and --i2.  These correspond to typical output
       produced by Illumina bcl2fastq and match the output from samtools fastq.  The index files
       will set both the BC barcode tag and its associated QT quality tag.

       The Illumina CASAVA identifiers may also be processed when the -i option is given.  The
       identifier supplies the READ1 / READ2 status, whether or not the read failed filtering
       (QCFAIL flag), and the barcode sequence, which is added to the BC tag.  This can be an
       alternative to explicitly specifying the index files, although note that doing so will not
       fill out the barcode quality tag.
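
       To make the CASAVA fields concrete, the sketch below splits a FASTQ header using awk.  It
       assumes the CASAVA 1.8 comment layout <read>:<is_filtered>:<control_number>:<barcode>,
       where Y in the second field marks a read that failed filtering.  The function name
       parse_casava and the example header are hypothetical; this illustrates the fields only and
       is not how samtools itself parses them.

```shell
# Sketch: report the fields of a CASAVA 1.8-style header that -i relies on.
# Assumed comment layout: <read>:<is_filtered>:<control_number>:<barcode>
parse_casava() {
    printf '%s\n' "$1" | awk '{
        n = split($2, f, ":")
        if (n < 4) { print "no CASAVA identifier"; exit }
        qc = (f[2] == "Y") ? "QCFAIL" : "pass"    # Y means the read was filtered out
        printf "READ%s %s BC=%s\n", f[1], qc, f[4]
    }'
}

# Hypothetical header line for the second read of a pair.
parse_casava '@M00001:12:000000000-ABCDE:1:1101:15000:1500 2:Y:0:ACGTACGT'
```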

OPTIONS

       -s FILE Import paired interleaved data from FILE.

       -0 FILE Import single-ended (unpaired) data from FILE.

               Operationally there is no difference between the -s and -0 options.  Given an
               interleaved file with /1 and /2 read name endings, both will correctly set the
               PAIRED, READ1 and READ2 flags; given data with no suffixes and no CASAVA
               identifiers being processed, both will leave the data as unpaired.  The two
               options exist to allow more descriptive command lines and to improve the header
               comment describing the samtools fastq decode command.

       -1 FILE, -2 FILE
               Import paired data from a pair of FILEs.  The BAM flag PAIRED will be set, but not
               PROPER_PAIR as the reads have not been aligned.  READ1 and READ2 will be stored in
               their original, unmapped, orientation.

       --i1 FILE, --i2 FILE
               Specifies index barcodes associated with the -1  and  -2  files.   These  will  be
               appended to READ1 and READ2 records in the barcode (BC) and quality (QT) tags.

       -i      Specifies  that the Illumina CASAVA identifiers should be processed.  This may set
               the READ1, READ2 and QCFAIL flags and add a barcode tag.

       -N, --name2
               Assume the read names are encoded in the SRA and ENA formats where the first  word
               is  an automatically generated name with the second field being the original name.
               This option extracts that second field instead.

       --barcode-tag TAG
               Changes the auxiliary tag used for barcode sequence.  Defaults to BC.

       --quality-tag TAG
               Changes the auxiliary tag used for barcode quality.  Defaults to QT.

       -o FILE Output to FILE.  By default output will be written to stdout.

       --order TAG
               When outputting a SAM record, also output an integer tag containing the record's
               position in the input (N for the Nth record).  This may be useful if the data is
               to be sorted or collated in some manner and the operation needs to be reversible.
               In this case the tag may be used with samtools sort -t TAG to regenerate the
               original input order.

       -r RG_line, --rg-line RG_line
               A complete @RG header line may be specified, with or without the initial "@RG"
               component.  If specified, the ID field from RG_line will also be used in each SAM
               record's RG auxiliary tag.

               If specified multiple times this appends to the RG line, automatically adding tabs
               between invocations.

       -R RG_ID, --rg RG_ID
               This is a shorter form of the option above, equivalent to --rg-line ID:RG_ID.   If
               both are specified then this option is ignored.

       -u      Output BAM or CRAM as uncompressed data.

       -T TAGLIST
               Looks for any SAM-format auxiliary tags in the comment field of a fastq read
               name.  These must match the <alpha-num><alpha-num>:<type>:<data> pattern as
               specified in the SAM specification.  TAGLIST can be blank or * to indicate that
               all tags should be copied to the output; otherwise it is a comma-separated list of
               the tags to include, with all others being discarded.
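
       As an illustration of the tag pattern that -T expects, the sketch below tests candidate
       strings against a regular expression.  The function name is_sam_tag is hypothetical, and
       restricting <type> to the letters A, i, f, Z, H and B is an assumption taken from the SAM
       specification; samtools may accept or reject strings differently.

```shell
# Sketch: succeed if the argument looks like <alpha-num><alpha-num>:<type>:<data>.
# Assumption: <type> is one of the SAM text types A, i, f, Z, H or B.
is_sam_tag() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9]{2}:[AifZHB]:.+$'
}

# Hypothetical comment fields; only the first two have the tag shape.
for field in 'BC:Z:ACGTACGT' 'NM:i:3' 'length=36'; do
    if is_sam_tag "$field"; then
        echo "tag:  $field"
    else
        echo "skip: $field"
    fi
done
```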

EXAMPLES

       Convert a single-ended fastq file to an unmapped CRAM.  Both of these commands perform the
       same action.

           samtools import -0 in.fastq -o out.cram
           samtools import in.fastq > out.cram

       Convert a pair of Illumina fastqs containing CASAVA identifiers to BAM, adding the barcode
       information to the BC auxiliary tag.

           samtools import -i -1 in_1.fastq -2 in_2.fastq -o out.bam
           samtools import -i in_[12].fastq > out.bam

       Specify the read group.  These commands are equivalent:

           samtools import -r "$(echo -e 'ID:xyz\tPL:ILLUMINA')" in.fq
           samtools import -r "$(echo -e '@RG\tID:xyz\tPL:ILLUMINA')" in.fq
           samtools import -r ID:xyz -r PL:ILLUMINA in.fq
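
       The echo -e form used above is not supported by every shell.  As a hedged alternative,
       printf can build the same tab-separated line portably.  The variable names, the read
       group ID grp1 and the output file names below are hypothetical, and the samtools commands
       only run when samtools and an in.fq file are present.  Note that -R records only the ID,
       so the second command omits the PL field.

```shell
# Hypothetical read-group values.
rg_id=grp1
# Build "ID:grp1<TAB>PL:ILLUMINA" without relying on echo -e.
rg_line=$(printf 'ID:%s\tPL:%s' "$rg_id" ILLUMINA)

# Run only when samtools and the input file exist.
if command -v samtools >/dev/null 2>&1 && [ -f in.fq ]; then
    # Full @RG line given with -r.
    samtools import -r "$rg_line" -o with_line.bam in.fq
    # Short form: equivalent to --rg-line ID:grp1.
    samtools import -R "$rg_id" -o with_id.bam in.fq
fi
```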

       Create an unmapped BAM file from a set of 4 Illumina fastqs from bcl2fastq, consisting of
       two read and two index files.  The CASAVA identifier is used only for setting QC pass /
       failure status.

           samtools import -i -1 R1.fq -2 R2.fq --i1 I1.fq --i2 I2.fq -o out.bam

       Convert a pair of CASAVA barcoded fastq files to unmapped CRAM with an incremental record
       counter, then sort this by minimiser in order to reduce file size.  The reversal process
       is also shown using samtools sort and samtools fastq.

           samtools import -i in_1.fq in_2.fq --order ro -O bam,level=0 | \
               samtools sort -@4 -M -o out.srt.cram -

           samtools sort -@4 -O bam -u -t ro out.srt.cram | \
               samtools fastq -1 out_1.fq -2 out_2.fq -i --index-format "i*i*"
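
       Preview the read name that the -N option would select from an SRA or ENA style header.
       This is a sketch only: the header and the function name second_name are hypothetical, and
       the awk command simply prints the second whitespace-separated field rather than
       reproducing the parsing done by samtools.

```shell
# Hypothetical SRA-style header: generated name first, original name second.
header='@SRR000001.1 HWI-ST1:7:1101:1234:5678 length=36'

# Print the second whitespace-separated field of a header line.
second_name() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

second_name "$header"
```

       With a real file, the corresponding conversion would be samtools import -N in.fq.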

AUTHOR

       Written by James Bonfield of the Wellcome Sanger Institute.

SEE ALSO

       samtools(1), samtools-fastq(1)

       Samtools website: <http://www.htslib.org/>