Provided by: pbsim_1.0.3-1_amd64

NAME

       pbsim - simulator for PacBio sequencing reads

SYNOPSIS

       pbsim [options] <reference.fasta>

DESCRIPTION

       The pbsim command produces simulated PacBio reads for the reference FASTA sequence
       <reference.fasta>.

       Model files (arguments for the --model_qc option) can be found in the
       /usr/share/pbsim/models directory.

OPTIONS

       The options for pbsim are divided into general options, options for sampling-based
       simulation and options for model-based simulation. Default values are shown in
       parentheses.

   General options
       --prefix
              prefix of output files (sd).

       --data-type
              data type. CLR or CCS (CLR).

       --depth
              depth of coverage (CLR: 20.0, CCS: 50.0).

       --length-min
              minimum length (100).

       --length-max
              maximum length (CLR: 25000, CCS: 2500).

       --accuracy-min
              minimum accuracy (CLR: 0.75, CCS: fixed as 0.75). This option can be used only
              with CLR.

       --accuracy-max
              maximum accuracy (CLR: 1.00, CCS: fixed as 1.00). This option can be used only
              with CLR.

       --difference-ratio
              ratio of differences, given as substitution:insertion:deletion. Each value may
              be up to 1000 (CLR: 10:60:30, CCS: 6:21:73).

       --seed
              seed for the pseudorandom number generator (Unix time).
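
       The --difference-ratio values are relative weights, which can be easier to read as
       percentages. The following is a minimal shell sketch, assuming the three values are
       simply normalized by their sum (an interpretation of "ratio", not something this page
       states). ratio_to_percent is a hypothetical helper and is not part of pbsim.

```shell
# Hypothetical helper (not part of pbsim): express a
# substitution:insertion:deletion ratio as percentages of all differences.
# Assumption: the three values are normalized by their sum.
ratio_to_percent() {
    echo "$1" | awk -F: '{
        total = $1 + $2 + $3
        if (total == 0) exit 1      # avoid division by zero on 0:0:0
        printf "sub=%.1f%% ins=%.1f%% del=%.1f%%\n", 100*$1/total, 100*$2/total, 100*$3/total
    }'
}

ratio_to_percent 10:60:30   # CLR default ratio
ratio_to_percent 6:21:73    # CCS default ratio
```

       Under that assumption, the CLR default corresponds to 10% substitutions, 60%
       insertions and 30% deletions.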

   Options for sampling-based simulation
       --sample-fastq
              FASTQ format file to sample.

       --sample-profile-id
              ID of the (filtered) sample-fastq profile. When used together with
              --sample-fastq, the profile is stored and the files sample_profile_<ID>.fastq
              and sample_profile_<ID>.stats are created. When used without --sample-fastq,
              the stored profile is re-used. Note that when a profile is re-used,
              --length-min, --length-max, --accuracy-min and --accuracy-max take the same
              values as in the profile.
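
       The store-then-re-use behaviour of --sample-profile-id can be sketched as two runs.
       This is only an illustration: the function names, the profile ID "pb1" and the
       prefixes "run1" and "run2" are placeholders chosen here, and the sketch assumes the
       second run starts in the directory where the profile files were written.

```shell
# Step 1: sample from a real FASTQ and store the filtered profile under an ID.
# "pb1", "run1" and "run2" are placeholder names chosen for this sketch.
# $1 = sample FASTQ, $2 = reference FASTA
store_profile() {
    pbsim --data-type CLR --depth 20 \
          --sample-fastq "$1" --sample-profile-id pb1 \
          --prefix run1 "$2"
}

# Step 2: omit --sample-fastq so that the stored profile "pb1" is re-used.
# $1 = reference FASTA
reuse_profile() {
    pbsim --data-type CLR --depth 20 \
          --sample-profile-id pb1 \
          --prefix run2 "$1"
}
```

       A typical call sequence would be "store_profile sample.fastq reference.fasta"
       followed by "reuse_profile other_reference.fasta".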

   Options for model-based simulation
       --model_qc
              model of quality code.

       --length-mean
              mean of length model (CLR: 3000.0, CCS: 450.0).

       --length-sd
              standard deviation of length model (CLR: 2300.0, CCS: 170.0).

       --accuracy-mean
              mean of accuracy model (CLR: 0.78, CCS: fixed as 0.98). This option can be used
              only with CLR.

       --accuracy-sd
              standard deviation of accuracy model (CLR: 0.02, CCS: fixed as 0.02). This
              option can be used only with CLR.

EXAMPLES

       To run model-based simulation:

           pbsim --data-type CLR \
                 --depth 20 \
                 --model_qc /usr/share/pbsim/models/model_qc_clr \
                 reference.fasta

       In the example above, simulated reads are sampled at random from the reference sequence
       ("reference.fasta"), and differences (errors) are then introduced into the sampled
       reads. The data type is CLR and the coverage depth is 20. If the reference is a
       multi-FASTA file, simulated data is created for each sequence in it, and three output
       files are created per sequence. "sd_0001.ref" is a single-FASTA file copied from the
       reference sequence. "sd_0001.fastq" is the simulated read dataset in FASTQ format.
       "sd_0001.maf" lists the alignments between the reference sequence and the simulated
       reads in MAF format. Read length and accuracy are simulated based on our model of
       PacBio reads.
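
       After a run with the default prefix, the output sets can be summarized with a short
       shell loop. This is a sketch under two assumptions not stated above: that each FASTQ
       record occupies exactly four lines, and that output for further sequences follows
       the same sd_<number> naming. count_reads is a hypothetical helper.

```shell
# Count records in a FASTQ file, assuming four lines per record
# (header, sequence, "+", qualities) with no line wrapping.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Report every output set for the default prefix "sd".
for fq in sd_*.fastq; do
    [ -e "$fq" ] || continue    # skip if no output files exist in this directory
    base=${fq%.fastq}
    echo "$base: $(count_reads "$fq") reads (reference: $base.ref, alignments: $base.maf)"
done
```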

       To run sampling-based simulation:

           pbsim --data-type CLR \
                 --depth 20 \
                 --sample-fastq sample.fastq \
                 reference.fasta

       In the sampling-based simulation, each simulated read takes its length and quality
       scores from a read drawn at random from the sample PacBio dataset ("sample.fastq").

LICENSE

       pbsim is available under the terms of the GNU General Public License, version 2 (GPL-2).

AUTHORS

       Michiaki Hamada (mhamada@k.u-tokyo.ac.jp), Yukiteru Ono

                                           January 2016                                  PBSIM(1)