Provided by: tabix_0.2.6-1git1_i386 bug


       bgzip - Block compression/decompression utility

       tabix - Generic indexer for TAB-delimited genome position files


       bgzip [-cdhB] [-b virtualOffset] [-s size] [file]

       tabix  [-0lf]  [-p gff|bed|sam|vcf] [-s seqCol] [-b begCol] [-e endCol]
       [-S lineSkip] [-c metaChar] [region1 [region2 [...]]]


       Tabix indexes a  TAB-delimited  genome  position  file  and
       creates  an  index  file when region is absent from the
       command-line.  The  input  data  file  must  be  position  sorted   and
       compressed by bgzip which has a gzip(1) like interface. After indexing,
       tabix is able  to  quickly  retrieve  data  lines  overlapping  regions
       specified in the format "chr:beginPos-endPos". Fast data retrieval also
       works over network if URI is given as a file name and in this case  the
       index file will be downloaded if it is not present locally.


       -p STR    Input  format  for indexing. Valid values are: gff, bed, sam,
                 vcf and psltab. This option should not  be  applied  together
                 with  any  of  -s, -b, -e, -c and -0; it is not used for data
                 retrieval because this setting is stored in the  index  file.

       -s INT    Column of sequence name. Option -s, -b, -e, -S, -c and -0 are
                 all stored in the index  file  and  thus  not  used  in  data
                 retrieval. [1]

       -b INT    Column of start chromosomal position. [4]

       -e INT    Column of end chromosomal position. The end column can be the
                 same as the start column. [5]

       -S INT    Skip first INT lines in the data file. [0]

       -c CHAR   Skip lines started with character CHAR. [#]

       -0        Specify that the position in the data file is  0-based  (e.g.
                 UCSC files) rather than 1-based.

       -h        Print the header/meta lines.

       -B        The  second  argument  is  a BED file. When this option is in
                 use, the input file may not be sorted or indexed. The  entire
                 input  will  be  read  sequentially.  Nonetheless,  with this
                 option, the format of the input must be specificed  correctly
                 on the command line.

       -f        Force to overwrite the index file if it is present.

       -l        List the sequence names stored in the index file.


       (grep  ^"#"  in.gff; grep -v ^"#" in.gff | sort -k1,1 -k4,4n) | bgzip >

       tabix -p gff sorted.gff.gz;

       tabix sorted.gff.gz chr1:10,000,000-20,000,000;


       It is straightforward to achieve overlap queries using the standard  B-
       tree  index (with or without binning) implemented in all SQL databases,
       or the R-tree index in PostgreSQL and Oracle. But there are still  many
       reasons  to  use  tabix.  Firstly,  tabix  directly works with a lot of
       widely used TAB-delimited formats such as GFF/GTF and BED.  We  do  not
       need  to  design database schema or specialized binary formats. Data do
       not need to be duplicated in different formats, either. Secondly, tabix
       works  on  compressed  data  files while most SQL databases do not. The
       GenCode annotation GTF can be compressed down to 4%.  Thirdly, tabix is
       fast.  The  same indexing algorithm is known to work efficiently for an
       alignment with a few billion short reads. SQL databases probably cannot
       easily  handle  data  at  this  scale.  Last  but  not the least, tabix
       supports remote data retrieval. One can put the data file and the index
       at  an FTP or HTTP server, and other users or even web services will be
       able to get a slice without downloading the entire file.


       Tabix  was  written  by  Heng  Li.  The  BGZF  library  was  originally
       implemented  by  Bob  Handsaker and modified by Heng Li for remote file
       access and in-memory caching.