Provided by: pgloader_3.4.1+dfsg-1_amd64

NAME

       pgloader - PostgreSQL data loader

SYNOPSIS

       pgloader [<options>] [<command-file>]...
       pgloader [<options>] SOURCE TARGET

DESCRIPTION

       pgloader  loads  data  from  various sources into PostgreSQL. It can transform the data it
       reads on the fly and submit raw SQL before  and  after  the  loading.  It  uses  the  COPY
       PostgreSQL  protocol  to  stream the data into the server, and manages errors by filling a
       pair of reject.dat and reject.log files.

       pgloader operates either using commands which are read from files:

           pgloader commands.load

       or by using arguments and options all provided on the command line:

           pgloader SOURCE TARGET

ARGUMENTS

       The pgloader arguments can be as many load files as needed, or a  pair  of  connection
       strings: one for the source and one for the target.

   SOURCE CONNECTION STRING
       The source connection string format is as follows:

           format:///absolute/path/to/file.ext
           format://./relative/path/to/file.ext

       Where format might be one of csv, fixed, copy, dbf, db3 or ixf.

           db://user:pass@host:port/dbname

        Where db might be one of sqlite, mysql or mssql.

        When using a file based source format, pgloader also supports natively fetching the file
        from an http location and decompressing an archive if needed. In that case it´s
        necessary to use the --type option to specify the expected format of the file. See the
        examples below.

        Also note that some file formats require describing some implementation details, such as
        the columns to be read, and the delimiters and quoting rules when loading from csv.

        For more complex loading scenarios, you will need to write a full-fledged load command
        in the syntax described later in this document.

   TARGET CONNECTION STRING
        The target connection string format is described in detail later in this document, see
        Section Connection String.

OPTIONS

   INQUIRY OPTIONS
       Use  these  options when you want to know more about how to use pgloader, as those options
       will cause pgloader not to load any data.

       -h, --help
              Show command usage summary and exit.

       -V, --version
              Show pgloader version string and exit.

       -E, --list-encodings
              List known encodings in this version of pgloader.

       -U, --upgrade-config
              Parse given files in the command line as pgloader.conf files with  the  INI  syntax
              that  was  in  use  in pgloader versions 2.x, and output the new command syntax for
              pgloader on standard output.

   GENERAL OPTIONS
       Those options are meant to tweak pgloader behavior when loading data.

       •   -v, --verbose: Be verbose.

       •   -q, --quiet: Be quiet.

       •   -d, --debug: Show debug level information messages.

        •   -D, --root-dir: Set the root working directory (defaults to "/tmp/pgloader").

        •   -L, --logfile: Set the pgloader log file (defaults to "/tmp/pgloader.log").

        •   --log-min-messages: Minimum level of verbosity needed for a log message to make it
            to the logfile. One of critical, log, error, warning, notice, info or debug.

        •   --client-min-messages: Minimum level of verbosity needed for a log message to make
            it to the console. One of critical, log, error, warning, notice, info or debug.

        •   -S, --summary: A filename where to copy the summary output. When relative, the
            filename is expanded into root-dir.

           The  format  of  the filename defaults to being human readable. It is possible to have
           the output in machine friendly formats  such  as  CSV,  COPY  (PostgreSQL´s  own  COPY
           format)  or  JSON  by  specifying  a  filename with the extension resp. .csv, .copy or
           .json.

        •   -l <file>, --load-lisp-file <file>: Specify a lisp file to compile and load into the
            pgloader image before reading the commands, allowing extra transformation functions
            to be defined. Those functions should be defined in the pgloader.transforms package.
            This option can appear more than once on the command line.

       •   --dry-run:

            Allows testing a .load file without actually trying to load any data. It´s useful
            for debugging a command file until it´s ok, in particular to fix connection strings.

       •   --on-error-stop

            Alter pgloader behavior: rather than trying to be smart about error handling and
            continue loading good data while separating away the bad rows, just stop as soon as
            PostgreSQL refuses anything sent to it. Useful to debug data processing,
            transformation functions and specific type casting.

       •   --self-upgrade <directory>:

            Specify a directory where to find pgloader sources so that one of the very first
            things it does is to dynamically load in (and compile to machine code) another
            version of itself, usually a newer one like a very recent git checkout.

   COMMAND LINE ONLY OPERATIONS
       Those  options are meant to be used when using pgloader from the command line only, rather
       than using a command file and the rich command clauses and parser. In simple cases, it can
       be  much  easier to use the SOURCE and TARGET directly on the command line, then tweak the
       loading with those options:

       •   --with "option":

           Allows setting options from the command line. You can use that option as many times as
           you  want. The option arguments must follow the WITH clause for the source type of the
           SOURCE specification, as described later in this document.

       •   --set "guc_name=´value´"

           Allows setting PostgreSQL configuration from the command line. Note  that  the  option
           parsing  is  the same as when used from the SET command clause, in particular you must
           enclose the guc value with single-quotes.

       •   --field "..."

           Allows setting a source field definition. Fields are accumulated in the order given on
           the command line. It´s possible to either use a --field option per field in the source
           file, or to separate field definitions by a comma, as  you  would  do  in  the  HAVING
           FIELDS clause.

       •   --cast "..."

           Allows setting a specific casting rule for loading the data.

       •   --type csv|fixed|db3|ixf|sqlite|mysql|mssql

            Allows forcing the source type, in cases where the SOURCE parsing isn´t satisfactory.

       •   --encoding <encoding>

           Set the encoding of the source file to load data from.

       •   --before <filename>

           Parse  given  filename for SQL queries and run them against the target database before
           loading the data from the source. The queries are parsed by pgloader itself: they need
           to  be  terminated  by a semi-colon (;) and the file may include \i or \ir commands to
           include another file.

       •   --after <filename>

           Parse given filename for SQL queries and run them against the  target  database  after
           having loaded the data from the source. The queries are parsed in the same way as with
           the --before option, see above.
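
        As an illustration of combining those switches, here´s a hedged sketch reusing the MySQL
        and PostgreSQL databases from the examples below; the cast rule and  the  hypothetical
        create-schema.sql file are only illustrations:

            pgloader --set "maintenance_work_mem='128MB'"                             \
                     --cast "type datetime to timestamptz drop default drop not null" \
                     --before ./create-schema.sql                                     \
                     mysql://user@localhost/sakila                                    \
                     postgresql:///pagila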

   MORE DEBUG INFORMATION
       To get the maximum amount of debug information, you can use both  the  --verbose  and  the
       --debug  switches  at  the  same time, which is equivalent to saying --client-min-messages
       data. Then the log messages will show the data being processed, in  the  cases  where  the
       code has explicit support for it.
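
        For instance, reusing the command file from the example below:

            pgloader --verbose --debug ./test/csv-districts.load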

USAGE EXAMPLES

       Review the command line options and pgloader´s version:

           pgloader --help
           pgloader --version

   Loading from a complex command
       Use  the  command file as the pgloader command argument, pgloader will parse that file and
       execute the commands found in it:

           pgloader --verbose ./test/csv-districts.load

   CSV
       Load data from a CSV file into a pre-existing table in your database:

           pgloader --type csv                                   \
                    --field id --field field                     \
                    --with truncate                              \
                    --with "fields terminated by ´,´"            \
                    ./test/data/matching-1.csv                   \
                    postgres:///pgloader?tablename=matching

        In that example the whole loading is driven from the command line, bypassing the need
        for writing a command in the pgloader command syntax entirely. As there´s no command
        though, the extra information needed must be provided on the command line using the
        --type, --field and --with switches.

       For documentation about the available syntaxes for the --field and --with switches, please
       refer to the CSV section later in the man page.

       Note also that the PostgreSQL URI includes the target tablename.
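
        Here´s a minimal sketch of such a pre-existing table for that example; the text data
        types are an assumption, any types able to receive the CSV values would do:

            create table matching
            (
                 id     text,
                 field  text
            );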

   Reading from STDIN
       File based pgloader sources can be loaded from the standard input,  as  in  the  following
       example:

           pgloader --type csv                                         \
                    --field "usps,geoid,aland,awater,aland_sqmi,awater_sqmi,intptlat,intptlong" \
                    --with "skip header = 1"                          \
                    --with "fields terminated by ´\t´"                \
                    -                                                 \
                    postgresql:///pgloader?districts_longlat          \
                    < test/data/2013_Gaz_113CDs_national.txt

       The  dash  (-)  character  as  a  source  is used to mean standard input, as usual in Unix
       command lines. It´s possible to stream compressed content to pgloader with this technique,
       using the Unix pipe:

           gunzip -c source.gz | pgloader --type csv ... - pgsql:///target?foo

   Loading from CSV available through HTTP
       The  same  command  as just above can also be run if the CSV file happens to be found on a
       remote HTTP location:

           pgloader --type csv                                                     \
                    --field "usps,geoid,aland,awater,aland_sqmi,awater_sqmi,intptlat,intptlong" \
                    --with "skip header = 1"                                       \
                    --with "fields terminated by ´\t´"                             \
                    http://pgsql.tapoueh.org/temp/2013_Gaz_113CDs_national.txt     \
                    postgresql:///pgloader?districts_longlat

       Some more options have to be used in that case, as the file  contains  a  one-line  header
       (most  commonly  that´s column names, could be a copyright notice). Also, in that case, we
       specify all the fields right into a single --field option argument.

       Again, the PostgreSQL target connection string must contain the tablename option  and  you
       have  to  ensure that the target table exists and may fit the data. Here´s the SQL command
       used in that example in case you want to try it yourself:

           create table districts_longlat
           (
                    usps        text,
                    geoid       text,
                    aland       bigint,
                    awater      bigint,
                    aland_sqmi  double precision,
                    awater_sqmi double precision,
                    intptlat    double precision,
                    intptlong   double precision
           );

       Also notice that the same command will work against an archived version of the same  data,
       e.g. http://pgsql.tapoueh.org/temp/2013_Gaz_113CDs_national.txt.gz.

        Finally, it´s important to note that pgloader first fetches the content from the HTTP
        URL into a local file, then expands the archive when it´s recognized to be one, and only
        then processes the locally expanded file.

       In  some  cases,  either because pgloader has no direct support for your archive format or
       maybe because expanding the archive is not feasible in your environment, you might want to
       stream  the  content  straight  from its remote location into PostgreSQL. Here´s how to do
       that, using the old battle tested Unix Pipes trick:

           curl http://pgsql.tapoueh.org/temp/2013_Gaz_113CDs_national.txt.gz \
           | gunzip -c                                                        \
           | pgloader --type csv                                              \
                      --field "usps,geoid,aland,awater,aland_sqmi,awater_sqmi,intptlat,intptlong"
                      --with "skip header = 1"                                \
                      --with "fields terminated by ´\t´"                      \
                      -                                                       \
                      postgresql:///pgloader?districts_longlat

       Now the OS will take care of the streaming and  buffering  between  the  network  and  the
       commands and pgloader will take care of streaming the data down to PostgreSQL.

   Migrating from SQLite
        The following command will open the SQLite database, discover its table definitions
        including indexes and foreign keys, migrate those definitions while casting the data
        type specifications to their PostgreSQL equivalent and then migrate the data over:

           createdb newdb
           pgloader ./test/sqlite/sqlite.db postgresql:///newdb

   Migrating from MySQL
       Just  create  a database where to host the MySQL data and definitions and have pgloader do
       the migration for you in a single command line:

           createdb pagila
           pgloader mysql://user@localhost/sakila postgresql:///pagila
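
        The same migration can also be written as a command file, which becomes useful as soon
        as you need WITH options or CAST rules; here´s a minimal sketch, with illustrative WITH
        options:

            LOAD DATABASE
                 FROM mysql://user@localhost/sakila
                 INTO postgresql:///pagila

            WITH include drop, create tables, create indexes, reset sequences;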

    Fetching an archived DBF file from an HTTP remote location
        It´s possible for pgloader to download a file from HTTP, unarchive it, and only then
        open it to discover the schema and then load the data:

           createdb foo
           pgloader --type dbf http://www.insee.fr/fr/methodes/nomenclatures/cog/telechargement/2013/dbf/historiq2013.zip postgresql:///foo

       Here  it´s not possible for pgloader to guess the kind of data source it´s being given, so
       it´s necessary to use the --type command line switch.

BATCHES AND RETRY BEHAVIOUR

        To load data to PostgreSQL, pgloader uses the COPY streaming protocol. While this is the
        fastest way to load data, COPY has an important drawback: as soon as PostgreSQL emits an
        error with any bit of data sent to it, whatever the problem is, the whole data set is
        rejected by PostgreSQL.

       To  work around that, pgloader cuts the data into batches of 25000 rows each, so that when
       a problem occurs it´s only impacting that many rows of data. Each batch is kept in  memory
       while the COPY streaming happens, in order to be able to handle errors should some happen.

       When PostgreSQL rejects the whole batch, pgloader logs the error message then isolates the
       bad row(s) from the accepted ones by retrying the batched rows in smaller batches.  To  do
       that,  pgloader  parses  the  CONTEXT  error  message from the failed COPY, as the message
       contains the line number where the error was found in  the  batch,  as  in  the  following
       example:

           CONTEXT: COPY errors, line 3, column b: "2006-13-11"

        Using that information, pgloader will reload all rows in the batch before the erroneous
        one, log the erroneous one as rejected, then try loading the remainder of the batch in a
        single attempt, which may or may not contain other erroneous data.

       At  the  end  of  a load containing rejected rows, you will find two files in the root-dir
       location, under a directory named the same as the  target  database  of  your  setup.  The
       filenames  are  the  target table, and their extensions are .dat for the rejected data and
       .log for the file containing the full PostgreSQL client side logs about the rejected data.
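
        For instance, with the default root-dir and the districts_longlat example above, the
        reject files would be found at paths such as (illustrative):

            /tmp/pgloader/pgloader/districts_longlat.dat
            /tmp/pgloader/pgloader/districts_longlat.log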

        The .dat file is formatted in the PostgreSQL text COPY format as documented in
        http://www.postgresql.org/docs/9.2/static/sql-copy.html#AEN66609 .

A NOTE ABOUT PERFORMANCE

       pgloader has been developed with performance in mind, to be able to cope with ever growing
       needs in loading large amounts of data into PostgreSQL.

        The basic architecture it uses is the old Unix pipe model, where a thread is responsible
        for loading the data (reading a CSV file, querying MySQL, etc) and fills pre-processed
        data into a queue. Another thread feeds from the queue, applies some more
        transformations to the input data and streams the end result to PostgreSQL using the
        COPY protocol.

       When  given  a  file  that the PostgreSQL COPY command knows how to parse, and if the file
       contains no erroneous data, then pgloader  will  never  be  as  fast  as  just  using  the
       PostgreSQL COPY command.

       Note  that  while the COPY command is restricted to read either from its standard input or
       from a local file on the server´s file system, the command line  tool  psql  implements  a
       \copy  command  that  knows  how to stream a file local to the client over the network and
       into the PostgreSQL server, using the same protocol as pgloader uses.
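
        As a point of comparison, here´s a sketch of loading the districts file from the earlier
        example with \copy, assuming the one-line header has already been stripped from the
        file, since COPY in text format will not skip it:

            psql -d pgloader \
                 -c "\copy districts_longlat from 'test/data/2013_Gaz_113CDs_national.txt'"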

A NOTE ABOUT PARALLELISM

       pgloader uses several concurrent tasks to process the data being loaded:

       •   a reader task reads the data in and pushes it to a queue,

        •   at least one writer task feeds from the queue and formats the raw data into the
            PostgreSQL COPY format in batches (so that it´s possible to then retry a failed
            batch without reading the data from source again), and then sends the data to
            PostgreSQL using the COPY protocol.

        The parameter workers controls how many worker threads are allowed to be active at any
        time (that´s the parallelism level); and the parameter concurrency controls how many
        tasks are started to handle the data (they may not all run at the same time, depending
        on the workers setting).

        We allow up to workers simultaneous workers to be active at the same time in the context
        of a single table. A single unit of work consists of several kinds of workers:

       •   a reader getting raw data from the source,

       •   N writers preparing and sending the data down to PostgreSQL.

        The N here is set to the concurrency parameter: with a CONCURRENCY of 2, we start (+ 1
        2) = 3 concurrent tasks, with a concurrency of 4 we start (+ 1 4) = 5 concurrent tasks,
        of which only workers may be active simultaneously.

       The  defaults  are  workers  = 4, concurrency = 1 when loading from a database source, and
       workers = 8, concurrency = 2 when loading from something else (currently, a  file).  Those
       defaults  are  arbitrary and waiting for feedback from users, so please consider providing
       feedback if you play with the settings.

       As the CREATE INDEX threads started by pgloader are only waiting until PostgreSQL is  done
       with  the real work, those threads are NOT counted into the concurrency levels as detailed
       here.

        By default, pgloader starts as many CREATE INDEX threads as the maximum number of
        indexes per table found in your source schema. It is possible to set the max parallel
        create index WITH option to another number in case there are just too many of them to
        create.
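
        Those settings are exposed as WITH options in the load command; here´s a fragment with
        illustrative values:

            WITH workers = 8,
                 concurrency = 2,
                 max parallel create index = 4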

SOURCE FORMATS

       pgloader supports the following input formats:

       •   csv, which includes also tsv and other  common  variants  where  you  can  change  the
           separator and the quoting rules and how to escape the quotes themselves;

        •   fixed columns files, where pgloader is flexible enough to accommodate source files
            with missing columns (ragged fixed length column files do exist);

        •   PostgreSQL COPY formatted files, following the COPY TEXT documentation of
            PostgreSQL, such as the reject files prepared by pgloader;

        •   dBase files known as db3 or dbf files;

        •   ixf formatted files, ixf being a binary storage format from IBM;

       •   sqlite databases with fully automated discovery of the schema and advanced cast rules;

       •   mysql databases with fully automated discovery of the schema and advanced cast rules;

       •   MS SQL databases with fully automated discovery of the schema and advanced cast rules.

PGLOADER COMMANDS SYNTAX

        pgloader implements a Domain Specific Language allowing complex data loading scripts to
        be set up, handling computed columns and on-the-fly sanitization of the input data. For
        more complex data loading scenarios, you will be required to learn that DSL´s syntax.
        It´s meant to look familiar to DBAs by being inspired by SQL where it makes sense, which
        is not that much after all.

       The  pgloader  commands  follow  the same global grammar rules. Each of them might support
       only a subset of the general options and provide specific options.

           LOAD <source-type>
                FROM <source-url>     [ HAVING FIELDS <source-level-options> ]
                INTO <postgresql-url> [ TARGET COLUMNS <columns-and-options> ]

           [ WITH <load-options> ]

           [ SET <postgresql-settings> ]

           [ BEFORE LOAD [ DO <sql statements> | EXECUTE <sql file> ] ... ]
           [  AFTER LOAD [ DO <sql statements> | EXECUTE <sql file> ] ... ]
           ;

        The main clauses are the LOAD, FROM, INTO and WITH clauses that each command implements.
        Some commands then implement the SET clause, or some specific clauses such as the CAST
        clause.

COMMON CLAUSES

       Some clauses are common to all commands:

       •   FROM

           The FROM clause specifies where to read the data from, and each command introduces its
           own  variant  of  sources.  For  instance,  the  CSV  source supports inline, stdin, a
           filename, a quoted filename, and a FILENAME MATCHING clause (see above);  whereas  the
           MySQL source only supports a MySQL database URI specification.

           In  all  cases, the FROM clause is able to read its value from an environment variable
           when using the form GETENV ´varname´.

       •   INTO

            The PostgreSQL connection URI must contain the name of the target table where to
            load the data into. That table must have already been created in PostgreSQL, and the
            name might be schema qualified.

           The INTO target database connection URI can be parsed from the value of an environment
           variable when using the form GETENV ´varname´.

            The INTO option also supports an optional comma separated list of target columns,
            which are either the name of an input field or the white space separated list of the
            target column name, its PostgreSQL data type and a USING expression.

           The  USING  expression  can  be  any  valid Common Lisp form and will be read with the
           current package set to pgloader.transforms, so that you can use functions  defined  in
           that  package,  such  as  functions  loaded  dynamically  with the --load command line
           parameter.

           Each USING expression is compiled at runtime to native code.

           This feature allows pgloader to load any number  of  fields  in  a  CSV  file  into  a
           possibly  different  number  of  columns  in  the database, using custom code for that
           projection.

       •   WITH

           Set of options to apply to the command, using a global syntax of either:

        •   key = value

        •   use option

        •   do not use option

       See each specific command for details.

       All data sources specific commands support the following options:

        •   on error stop

        •   batch rows = R

        •   batch size = ... MB

        •   prefetch rows = ...

       See the section BATCH BEHAVIOUR OPTIONS for more details.

       In addition, the following settings are available:

        •   workers = W

        •   concurrency = C

        •   max parallel create index = I

       See section A NOTE ABOUT PARALLELISM for more details.

       •   SET

            This clause allows specifying session parameters to be set for all the sessions
            opened by pgloader. It expects a comma separated list of parameter name, the equal
            sign, then the single-quoted value.

           The names and values of the parameters are not validated by pgloader, they  are  given
           as-is to PostgreSQL.

       •   BEFORE LOAD DO

            You can run SQL queries against the database before loading the data from the CSV
            file. The most common such queries are CREATE TABLE IF NOT EXISTS statements, so
            that the data can be loaded.

           Each  command  must be dollar-quoted: it must begin and end with a double dollar sign,
           $$. Dollar-quoted queries are then comma separated. No extra punctuation  is  expected
           after the last SQL query.

       •   BEFORE LOAD EXECUTE

           Same  behaviour  as  in  the BEFORE LOAD DO clause. Allows you to read the SQL queries
           from a SQL file. Implements support for PostgreSQL dollar-quoting and the \i  and  \ir
           include facilities as in psql batch mode (where they are the same thing).

       •   AFTER LOAD DO

           Same  format  as  BEFORE  LOAD DO, the dollar-quoted queries found in that section are
           executed once the  load  is  done.  That´s  the  right  time  to  create  indexes  and
           constraints, or re-enable triggers.

       •   AFTER LOAD EXECUTE

           Same behaviour as in the AFTER LOAD DO clause. Allows you to read the SQL queries from
           a SQL file. Implements support for  PostgreSQL  dollar-quoting  and  the  \i  and  \ir
           include facilities as in psql batch mode (where they are the same thing).
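
        Here´s a sketch of how those common clauses fit together; the environment variable
        names, table and SQL file are illustrative, and the overall shape follows the LOAD CSV
        example given later in this document:

            LOAD CSV
                 FROM GETENV 'CSV_SOURCE' HAVING FIELDS (id, name)
                 INTO GETENV 'TARGET_URI' TARGET COLUMNS (id, name)

                 WITH truncate,
                      fields terminated by ','

                  SET work_mem to '32 MB'

               BEFORE LOAD DO
                 $$ create table if not exists sample (id integer, name text); $$

                AFTER LOAD EXECUTE 'analyze-sample.sql';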

   Connection String
       The  <postgresql-url>  parameter is expected to be given as a Connection URI as documented
       in               the               PostgreSQL               documentation               at
       http://www.postgresql.org/docs/9.3/static/libpq-connect.html#LIBPQ-CONNSTRING.

           postgresql://[user[:password]@][netloc][:port][/dbname][?option=value&...]

       Where:

       •   user

           Can  contain  any  character,  including colon (:) which must then be doubled (::) and
           at-sign (@) which must then be doubled (@@).

           When omitted, the user name defaults to the value of the PGUSER environment  variable,
           and if it is unset, the value of the USER environment variable.

       •   password

            Can contain any character, including the at sign (@) which must then be doubled
            (@@). To leave the password empty, when the user name ends with an at sign, you then
            have to use the syntax user:@.

           When  omitted,  the  password  defaults  to  the  value  of the PGPASSWORD environment
           variable if it is set, otherwise the password is left unset.

       •   netloc

            Can be either a hostname in dotted notation, or an ipv4 address, or a Unix domain
            socket path. Empty is the default network location; on a system providing unix
            domain sockets that method is preferred, otherwise the netloc defaults to localhost.

           It´s  possible  to  force  the  unix  domain  socket  path   by   using   the   syntax
           unix:/path/to/where/the/socket/file/is,  so  to  force a non default socket path and a
           non default port, you would have:

           postgresql://unix:/tmp:54321/dbname

        The netloc defaults to the value of the PGHOST environment variable, and if it is unset,
        to the default unix socket path when running on a Unix system, or localhost otherwise.

       •   dbname

            Should be a proper identifier (letter followed by a mix of letters, digits and the
            punctuation signs comma (,), dash (-) and underscore (_)).

           When omitted, the dbname defaults to the value of the environment variable PGDATABASE,
           and if that is unset, to the user value as determined above.

       •   options

           The optional parameters must be supplied with the form name=value,  and  you  may  use
           several parameters by separating them away using an ampersand (&) character.

            Only some options are supported here: tablename (which might be qualified with a
            schema name), sslmode, host, port, dbname, user and password.

           The sslmode parameter values can be one of disable, allow, prefer or require.

           For backward compatibility reasons, it´s possible  to  specify  the  tablename  option
           directly, without spelling out the tablename= parts.

            The options override the main URI components when both are given, and using the
            percent-encoded option parameters allows using passwords starting with a colon and
            bypassing other URI components parsing limitations.
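
        Putting those elements together, a target connection string using options might look
        like the following, with illustrative values:

            postgresql://dim:secret@localhost:5432/pgloader?tablename=geolite.blocks&sslmode=prefer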

   Regular Expressions
       Several  clauses  listed  in  the  following accept regular expressions with the following
       input rules:

       •   A regular expression begins with a tilde sign (~),

       •   is then followed with an opening sign,

       •   then any character is allowed and considered part of the  regular  expression,  except
           for the closing sign,

       •   then a closing sign is expected.

        The opening and closing signs are allowed in pairs; here´s the complete list of allowed
        delimiters:

           ~//
           ~[]
           ~{}
           ~()
           ~<>
           ~""
           ~´´
           ~||
           ~##

       Pick the set of delimiters that don´t collide with the regular expression you´re trying to
       input.  If  your  expression is such that none of the solutions allow you to enter it, the
       places where such expressions are allowed should allow for a list of expressions.
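
        For instance, the FILENAME MATCHING clause used in the LOAD ARCHIVE example later in
        this document could be entered with any of those delimiter pairs:

            FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
            FROM FILENAME MATCHING ~<GeoLiteCity-Location.csv>
            FROM FILENAME MATCHING ~"GeoLiteCity-Location.csv"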

   Comments
       Any command may contain comments, following those input rules:

       •   the -- delimiter begins a comment that ends with the end of the current line,

       •   the delimiters /* and */ respectively start and end a comment, which can be  found  in
           the middle of a command or span several lines.

       Any place where you could enter a whitespace will accept a comment too.
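
        For instance, both comment styles can be mixed within a single command; this sketch uses
        illustrative file and table names:

            -- comments may appear anywhere whitespace is accepted
            LOAD CSV
               FROM 'data.csv' /* a path relative to the command file */
                    HAVING FIELDS (id, name)
               INTO postgresql:///pgloader?sample
               WITH fields terminated by ',';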

   Batch behaviour options
       All  pgloader  commands have support for a WITH clause that allows for specifying options.
       Some options are generic and accepted  by  all  commands,  such  as  the  batch  behaviour
       options,  and some options are specific to a data source kind, such as the CSV skip header
       option.

       The global batch behaviour options are:

       •   batch rows

           Takes a numeric value as argument, used as the maximum number of  rows  allowed  in  a
           batch.  The  default  is  25  000  and can be changed to try having better performance
           characteristics or to control pgloader memory usage;

       •   batch size

            Takes a memory unit as argument, such as 20 MB, its default value. Accepted
            multipliers are kB, MB, GB, TB and PB. The case is important so as not to be
            confused about bits versus bytes: we´re only talking bytes here.

       •   prefetch rows

           Takes a numeric value as argument, defaults to 100000. That´s the number of rows  that
           pgloader  is  allowed to read in memory in each reader thread. See the workers setting
           for how many reader threads are allowed to run at the same time.

       Other options are specific to each input source, please refer to  specific  parts  of  the
       documentation for their listing and covering.

        A batch is then closed as soon as either the batch rows or the batch size threshold is
        crossed, whichever comes first. In cases when a batch has to be closed because of the
        batch size setting, a debug level log message is printed stating how many rows fit in
        the oversized batch.
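
        A WITH clause fragment combining the batch options above, with values chosen only for
        illustration:

            WITH batch rows = 10000,
                 batch size = 10 MB,
                 prefetch rows = 50000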

LOAD CSV

       This command instructs pgloader to load data from a CSV file. Here´s an example:

           LOAD CSV
              FROM ´GeoLiteCity-Blocks.csv´ WITH ENCODING iso-646-us
                   HAVING FIELDS
                   (
                      startIpNum, endIpNum, locId
                   )
              INTO postgresql://user@localhost:54393/dbname?geolite.blocks
                   TARGET COLUMNS
                   (
                      iprange ip4r using (ip-range startIpNum endIpNum),
                      locId
                   )
              WITH truncate,
                   skip header = 2,
                   fields optionally enclosed by ´"´,
                   fields escaped by backslash-quote,
                   fields terminated by ´\t´

               SET work_mem to ´32 MB´, maintenance_work_mem to ´64 MB´;

       The csv format command accepts the following clauses and options:

       •   FROM

           Filename  where  to  load  the  data  from.  Accepts  an  ENCODING  option.  Use   the
           --list-encodings option to know which encoding names are supported.

           The  filename  may  be  enclosed  by  single quotes, and could be one of the following
           special values:

       •   inline

           The data is found after the end of the parsed commands.  Any  number  of  empty  lines
           between the end of the commands and the beginning of the data is accepted.

       •   stdin

           Reads the data from the standard input stream.

       •   FILENAMES MATCHING

           The whole matching clause must follow the following rule:

           [ ALL FILENAMES | [ FIRST ] FILENAME ]
           MATCHING regexp
           [ IN DIRECTORY ´...´ ]

        The matching clause applies the given regular expression (see above for exact syntax;
        several options can be used here) to filenames. It´s then possible to load data from
        only the first match, or from all of them.

       The optional IN DIRECTORY clause allows specifying which directory to walk for finding the
       data files, and can be either relative  to  where  the  command  file  is  read  from,  or
        absolute. The given directory must exist.

       The  FROM  option also supports an optional comma separated list of field names describing
       what is expected in the CSV data file, optionally introduced by the clause HAVING FIELDS.

        Each field name can be either only one name or a name followed by specific reader
        options for that field, enclosed in square brackets and comma-separated. Supported
        per-field reader options are:

       •   terminated by

           See the description of field terminated by below.

           The processing of this option is not currently implemented.

       •   date format

            When the field is expected to be of the date type, then this option allows
            specifying the date format used in the file.

            Date format strings are template strings modeled against the PostgreSQL to_char
            template strings support, limited to the following patterns:

       •   YYYY, YYY, YY for the year part

       •   MM for the numeric month part

       •   DD for the numeric day part

       •   HH, HH12, HH24 for the hour part

       •   am, AM, a.m., A.M.

       •   pm, PM, p.m., P.M.

       •   MI for the minutes part

       •   SS for the seconds part

       •   MS for the milliseconds part (4 digits)

       •   US for the microseconds part (6 digits)

       •   unparsed punctuation signs: - . * # @ T / \ and space

       Here´s an example of a date format specification:

           column-name [date format ´YYYY-MM-DD HH24-MI-SS.US´]

       •   null if

           This option takes an argument which is either the keyword blanks  or  a  double-quoted
           string.

           When  blanks  is used and the field value that is read contains only space characters,
           then it´s automatically converted to an SQL NULL value.

           When a double-quoted string is used and that string is read as the field  value,  then
           the field value is automatically converted to an SQL NULL value.

       •   trim both whitespace, trim left whitespace, trim right whitespace

            This option allows trimming whitespace in the read data, either from both sides of
            the data, or only the whitespace characters found on the left of the string, or only
            those on the right of the string.
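
        Per-field reader options can be combined inside the square brackets; here´s an
        illustrative field definition:

            created_at [date format 'YYYY-MM-DD HH24-MI-SS', null if blanks, trim both whitespace]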

       •   WITH

           When loading from a CSV file, the following options are supported:

       •   truncate

           When  this option is listed, pgloader issues a TRUNCATE command against the PostgreSQL
           target table before reading the data file.

       •   drop indexes

           When this option is listed, pgloader  issues  DROP  INDEX  commands  against  all  the
           indexes  defined  on  the  target  table  before  copying  the data, then CREATE INDEX
           commands once the COPY is done.

           In order to get the best performance possible, all the indexes are created in parallel
           and  when  done the primary keys are built again from the unique indexes just created.
           This two step process allows creating the primary key index in parallel with the other
           indexes,  as only the ALTER TABLE command needs an access exclusive lock on the target
           table.

       •   disable triggers

           When this option is listed, pgloader issues an ALTER TABLE  ...  DISABLE  TRIGGER  ALL
           command  against the PostgreSQL target table before copying the data, then the command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This option allows loading data into a pre-existing table  ignoring  the  foreign  key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   skip header

            Takes a numeric value as argument. Instructs pgloader to skip that many lines at the
            beginning of the input file.

       •   csv header

           Use  the  first line read after skip header as the list of csv field names to be found
           in the CSV file, using the same CSV parameters as for the CSV data.

       •   trim unquoted blanks

           When reading unquoted values in the CSV file, remove the blanks found in  between  the
           separator and the value. That behaviour is the default.

       •   keep unquoted blanks

           When  reading  unquoted  values  in  the  CSV  file,  keep blanks found in between the
           separator and the value.

       •   fields optionally enclosed by

           Takes a single character as argument, which must be found inside  single  quotes,  and
           might  be  given  as  the printable character itself, the special value \t to denote a
           tabulation character, or 0x then an hexadecimal value read as the ASCII code  for  the
           character.

           This  character  is  used  as  the  quoting character in the CSV file, and defaults to
           double-quote.

       •   fields not enclosed

           By default, pgloader will use the double-quote character as the  enclosing  character.
           If  you have a CSV file where fields are not enclosed and are using double-quote as an
           expected ordinary character, then use the option  fields  not  enclosed  for  the  CSV
           parser to accept those values.

       •   fields escaped by

           Takes either the special value backslash-quote or double-quote, or any value supported
           by the fields terminated by option (see  below).  This  value  is  used  to  recognize
           escaped  field separators when they are to be found within the data fields themselves.
           Defaults to double-quote.

       •   csv escape mode

           Takes either the special value quote (the default) or following  and  allows  the  CSV
           parser  to  parse  either only escaped field separator or any character (including CSV
           data) when using the following value.

       •   fields terminated by

           Takes a single character as argument, which must be found inside  single  quotes,  and
           might  be  given  as  the printable character itself, the special value \t to denote a
           tabulation character, or 0x then an hexadecimal value read as the ASCII code  for  the
           character.

           This character is used as the field separator when reading the CSV data.

       •   lines terminated by

           Takes  a  single  character as argument, which must be found inside single quotes, and
           might be given as the printable character itself, the special value  \t  to  denote  a
           tabulation  character,  or 0x then an hexadecimal value read as the ASCII code for the
           character.

           This character is used to recognize end-of-line condition when reading the CSV data.

LOAD FIXED COLS

       This command instructs pgloader to load data from a text file containing columns  arranged
       in a fixed size manner. Here´s an example:

           LOAD FIXED
                FROM inline
                     (
                      a from  0 for 10,
                      b from 10 for  8,
                      c from 18 for  8,
                      d from 26 for 17 [null if blanks, trim right whitespace]
                     )
                INTO postgresql:///pgloader?fixed
                     (
                        a, b,
                        c time using (time-with-no-separator c),
                        d
                     )

                WITH truncate

                 SET work_mem to ´14MB´,
                     standard_conforming_strings to ´on´

           BEFORE LOAD DO
                $$ drop table if exists fixed; $$,
                $$ create table fixed (
                    a integer,
                    b date,
                    c time,
                    d text
                   );
                $$;

            01234567892008052011431250firstline
               01234562008052115182300left blank-padded
            12345678902008052208231560another line
             2345609872014092914371500
             2345678902014092914371520

       The fixed format command accepts the following clauses and options:

       •   FROM

           Filename   where  to  load  the  data  from.  Accepts  an  ENCODING  option.  Use  the
           --list-encodings option to know which encoding names are supported.

           The filename may be enclosed by single quotes, and  could  be  one  of  the  following
           special values:

       •   inline

           The  data  is  found  after  the end of the parsed commands. Any number of empty lines
           between the end of the commands and the beginning of the data is accepted.

       •   stdin

           Reads the data from the standard input stream.

       •   FILENAMES MATCHING

           The whole matching clause must follow the following rule:

           [ ALL FILENAMES | [ FIRST ] FILENAME ]
           MATCHING regexp
           [ IN DIRECTORY ´...´ ]

        The matching clause applies the given regular expression (see above for exact syntax;
        several options can be used here) to filenames. It´s then possible to load data from
        only the first match, or from all of them.

       The optional IN DIRECTORY clause allows specifying which directory to walk for finding the
       data  files,  and  can  be  either  relative  to  where  the command file is read from, or
        absolute. The given directory must exist.

       The FROM option also supports an optional comma separated list of field  names  describing
       what is expected in the FIXED data file.

        Each field name is composed of the field name followed by specific reader options for
        that field. Supported per-field reader options are the following, where only start and
        length are required.

       •   start

           Position  in  the  line where to start reading that field´s value. Can be entered with
           decimal digits or 0x then hexadecimal digits.

       •   length

            How many bytes to read from the start position for that field´s value. Same format
            as start.

       Those optional parameters must be enclosed in square brackets and comma-separated:

       •   terminated by

           See the description of field terminated by below.

           The processing of this option is not currently implemented.

       •   date format

            When the field is expected to be of the date type, then this option allows
            specifying the date format used in the file.

            Date format strings are template strings modeled against the PostgreSQL to_char
            template strings support, limited to the following patterns:

       •   YYYY, YYY, YY for the year part

       •   MM for the numeric month part

       •   DD for the numeric day part

       •   HH, HH12, HH24 for the hour part

       •   am, AM, a.m., A.M.

       •   pm, PM, p.m., P.M.

       •   MI for the minutes part

       •   SS for the seconds part

       •   MS for the milliseconds part (4 digits)

       •   US for the microseconds part (6 digits)

       •   unparsed punctuation signs: - . * # @ T / \ and space

       Here´s an example of a date format specification:

           column-name [date format ´YYYY-MM-DD HH24-MI-SS.US´]

       •   null if

           This  option  takes  an argument which is either the keyword blanks or a double-quoted
           string.

           When blanks is used and the field value that is read contains only  space  characters,
           then it´s automatically converted to an SQL NULL value.

           When  a  double-quoted string is used and that string is read as the field value, then
           the field value is automatically converted to an SQL NULL value.

       •   trim both whitespace, trim left whitespace, trim right whitespace

            This option allows trimming whitespace in the read data, either from both sides of
            the data, or only the whitespace characters found on the left of the string, or only
            those on the right of the string.

       •   WITH

           When loading from a FIXED file, the following options are supported:

       •   truncate

           When this option is listed, pgloader issues a TRUNCATE command against the  PostgreSQL
           target table before reading the data file.

       •   disable triggers

           When  this  option  is  listed, pgloader issues an ALTER TABLE ... DISABLE TRIGGER ALL
           command against the PostgreSQL target table before copying the data, then the  command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This  option  allows  loading  data into a pre-existing table ignoring the foreign key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   skip header

            Takes a numeric value as argument. Instructs pgloader to skip that many lines at the
            beginning of the input file.

LOAD COPY FORMATTED FILES

        This command instructs pgloader to load from a file containing COPY TEXT data as
        described in the PostgreSQL documentation. Here´s an example:

           LOAD COPY
                FROM copy://./data/track.copy
                     (
                       trackid, track, album, media, genre, composer,
                       milliseconds, bytes, unitprice
                     )
                INTO postgresql:///pgloader?track_full

                WITH truncate

                 SET work_mem to ´14MB´,
                     standard_conforming_strings to ´on´

           BEFORE LOAD DO
                $$ drop table if exists track_full; $$,
                $$ create table track_full (
                     trackid      bigserial,
                     track        text,
                     album        text,
                     media        text,
                     genre        text,
                     composer     text,
                     milliseconds bigint,
                     bytes        bigint,
                     unitprice    numeric
                   );
                $$;

       The COPY format command accepts the following clauses and options:

       •   FROM

            Filename where to load the data from. This supports local files, HTTP URLs and zip
            files containing a single dbf file of the same name. Fetching such a zip file from
            an HTTP address is of course supported.

       •   inline

           The  data  is  found  after  the end of the parsed commands. Any number of empty lines
           between the end of the commands and the beginning of the data is accepted.

       •   stdin

           Reads the data from the standard input stream.

       •   FILENAMES MATCHING

           The whole matching clause must follow the following rule:

           [ ALL FILENAMES | [ FIRST ] FILENAME ]
           MATCHING regexp
           [ IN DIRECTORY ´...´ ]

        The matching clause applies the given regular expression (see above for exact syntax;
        several options can be used here) to filenames. It´s then possible to load data from
        only the first match, or from all of them.

       The optional IN DIRECTORY clause allows specifying which directory to walk for finding the
       data  files,  and  can  be  either  relative  to  where  the command file is read from, or
        absolute. The given directory must exist.

       •   WITH

           When loading from a COPY file, the following options are supported:

       •   delimiter

           Takes a single character as argument, which must be found inside  single  quotes,  and
           might  be  given  as  the printable character itself, the special value \t to denote a
           tabulation character, or 0x then an hexadecimal value read as the ASCII code  for  the
           character.

           This character is used as the delimiter when reading the data, in a similar way to the
           PostgreSQL COPY option.

       •   null

           Takes a quoted string as an argument (quotes can be either  double  quotes  or  single
           quotes) and uses that string as the NULL representation in the data.

           This is similar to the null COPY option in PostgreSQL.

       •   truncate

           When  this option is listed, pgloader issues a TRUNCATE command against the PostgreSQL
           target table before reading the data file.

       •   disable triggers

           When this option is listed, pgloader issues an ALTER TABLE  ...  DISABLE  TRIGGER  ALL
           command  against the PostgreSQL target table before copying the data, then the command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This option allows loading data into a pre-existing table  ignoring  the  foreign  key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   skip header

            Takes a numeric value as argument. Instructs pgloader to skip that many lines at the
            beginning of the input file.

LOAD DBF

       This command instructs pgloader to load data from a DBF file. Here´s an example:

           LOAD DBF
               FROM http://www.insee.fr/fr/methodes/nomenclatures/cog/telechargement/2013/dbf/reg2013.dbf
               INTO postgresql://user@localhost/dbname
               WITH truncate, create table;

       The dbf format command accepts the following clauses and options:

       •   FROM

            Filename where to load the data from. This supports local files, HTTP URLs and zip
            files containing a single dbf file of the same name. Fetching such a zip file from
            an HTTP address is of course supported.

       •   WITH

           When loading from a DBF file, the following options are supported:

       •   truncate

           When  this option is listed, pgloader issues a TRUNCATE command against the PostgreSQL
           target table before reading the data file.

       •   disable triggers

           When this option is listed, pgloader issues an ALTER TABLE  ...  DISABLE  TRIGGER  ALL
           command  against the PostgreSQL target table before copying the data, then the command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This option allows loading data into a pre-existing table  ignoring  the  foreign  key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   create table

           When this option is listed, pgloader creates the table using the meta  data  found  in
           the  DBF  file,  which  must contain a list of fields with their data type. A standard
           data type conversion from DBF to PostgreSQL is done.

       •   table name

            This option expects as its value the possibly qualified name of the table to create.

LOAD IXF

       This command instructs pgloader to load data from an IBM IXF file. Here´s an example:

           LOAD IXF
               FROM data/nsitra.test1.ixf
               INTO postgresql:///pgloader?nsitra.test1
               WITH truncate, create table, timezone UTC

             BEFORE LOAD DO
              $$ create schema if not exists nsitra; $$,
              $$ drop table if exists nsitra.test1; $$;

       The ixf format command accepts the following clauses and options:

       •   FROM

            Filename where to load the data from. This supports local files, HTTP URLs and zip
            files containing a single ixf file of the same name. Fetching such a zip file from
            an HTTP address is of course supported.

       •   WITH

           When loading from a IXF file, the following options are supported:

       •   truncate

           When this option is listed, pgloader issues a TRUNCATE command against the  PostgreSQL
           target table before reading the data file.

       •   disable triggers

           When  this  option  is  listed, pgloader issues an ALTER TABLE ... DISABLE TRIGGER ALL
           command against the PostgreSQL target table before copying the data, then the  command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This  option  allows  loading  data into a pre-existing table ignoring the foreign key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   create table

            When this option is listed, pgloader creates the table using the meta data found in
            the IXF file, which must contain a list of fields with their data type. A standard
            data type conversion from IXF to PostgreSQL is done.

       •   table name

            This option expects as its value the possibly qualified name of the table to create.

       •   timezone

            This option allows specifying which timezone is used when parsing timestamps from an
            IXF file, and defaults to UTC. Expected values are either UTC, GMT or a single
            quoted location name such as ´Universal´ or ´Europe/Paris´.

LOAD ARCHIVE

       This  command  instructs  pgloader  to  load  data  from one or more files contained in an
       archive. Currently the only supported archive format is ZIP,  and  the  archive  might  be
       downloaded from an HTTP URL.

       Here´s an example:

           LOAD ARCHIVE
              FROM /Users/dim/Downloads/GeoLiteCity-latest.zip
              INTO postgresql:///ip4r

              BEFORE LOAD
                DO $$ create extension if not exists ip4r; $$,
                   $$ create schema if not exists geolite; $$,

                EXECUTE ´geolite.sql´

              LOAD CSV
                   FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
                        WITH ENCODING iso-8859-1
                        (
                           locId,
                           country,
                           region     null if blanks,
                           city       null if blanks,
                           postalCode null if blanks,
                           latitude,
                           longitude,
                           metroCode  null if blanks,
                           areaCode   null if blanks
                        )
                   INTO postgresql:///ip4r?geolite.location
                        (
                           locid,country,region,city,postalCode,
                           location point using (format nil "(~a,~a)" longitude latitude),
                           metroCode,areaCode
                        )
                   WITH skip header = 2,
                        fields optionally enclosed by ´"´,
                        fields escaped by double-quote,
                        fields terminated by ´,´

             AND LOAD CSV
                   FROM FILENAME MATCHING ~/GeoLiteCity-Blocks.csv/
                        WITH ENCODING iso-8859-1
                        (
                           startIpNum, endIpNum, locId
                        )
                   INTO postgresql:///ip4r?geolite.blocks
                        (
                           iprange ip4r using (ip-range startIpNum endIpNum),
                           locId
                        )
                   WITH skip header = 2,
                        fields optionally enclosed by ´"´,
                        fields escaped by double-quote,
                        fields terminated by ´,´

              FINALLY DO
                $$ create index blocks_ip4r_idx on geolite.blocks using gist(iprange); $$;

       The archive command accepts the following clauses and options:

       •   FROM

           Filename or HTTP URI to load the data from. When given an HTTP URL the linked file
           will get downloaded locally before processing.

           If the file is a zip file, the command line  utility  unzip  is  used  to  expand  the
           archive  into  files  in $TMPDIR, or /tmp if $TMPDIR is unset or set to a non-existing
           directory.

           Then the following commands are used from the top level directory  where  the  archive
           has been expanded.

       •   command [ AND command ... ]

           A series of commands to run against the contents of the archive; at the moment only
           CSV, FIXED and DBF commands are supported.

           Note that those commands support the clause FROM FILENAME MATCHING, which allows the
           pgloader command not to depend on the exact names of the archive directories.

           The same clause can also be applied to several files at once using the spelling FROM
           ALL FILENAMES MATCHING and a regular expression.

           The whole matching clause must follow the following rule:

            FROM [ ALL FILENAMES | [ FIRST ] FILENAME ] MATCHING
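
           As a hedged illustration, the following spellings all satisfy that rule (the regular
           expressions themselves are hypothetical):

             FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
             FROM ALL FILENAMES MATCHING ~/GeoLiteCity-.*/
             FROM FIRST FILENAME MATCHING ~/Blocks/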

       •   FINALLY DO

           SQL Queries to run once the data is loaded, such as CREATE INDEX.

LOAD MYSQL DATABASE

       This command instructs pgloader to load data from a database connection. The only
       supported database source is currently MySQL, and pgloader supports dynamically converting
       the schema of the source database as well as building the indexes.

       A default set of casting rules is provided and might be overloaded and appended to by the
       command.

       Here´s an example using as many options as possible, some of them even being defaults.
       Chances are you don´t need that complex a setup; don´t copy and paste it, use it only as a
       reference!

           LOAD DATABASE
                FROM      mysql://root@localhost/sakila
                INTO postgresql://localhost:54393/sakila

            WITH include drop, create tables, create indexes, reset sequences,
                 workers = 8, concurrency = 1,
                 multiple readers per thread, rows per range = 50000

             SET PostgreSQL PARAMETERS
                 maintenance_work_mem to ´128MB´,
                 work_mem to ´12MB´,
                 search_path to ´sakila, public, "$user"´

             SET MySQL PARAMETERS
                 net_read_timeout  = ´120´,
                 net_write_timeout = ´120´

            CAST type bigint when (= precision 20) to bigserial drop typemod,
                 type date drop not null drop default using zero-dates-to-null,
                 -- type tinyint to boolean using tinyint-to-boolean,
                 type year to integer

            MATERIALIZE VIEWS film_list, staff_list

            -- INCLUDING ONLY TABLE NAMES MATCHING ~/film/, ´actor´
            -- EXCLUDING TABLE NAMES MATCHING ~<ory>
            -- DECODING TABLE NAMES MATCHING ~/messed/, ~/encoding/ AS utf8
            -- ALTER TABLE NAMES MATCHING ´film´ RENAME TO ´films´
            -- ALTER TABLE NAMES MATCHING ~/_list$/ SET SCHEMA ´mv´

            ALTER TABLE NAMES MATCHING ~/_list$/, ´sales_by_store´, ~/sales_by/
             SET SCHEMA ´mv´

            ALTER TABLE NAMES MATCHING ´film´ RENAME TO ´films´
            ALTER TABLE NAMES MATCHING ~/./ SET (fillfactor=´40´)

            ALTER SCHEMA ´sakila´ RENAME TO ´pagila´

            BEFORE LOAD DO
              $$ create schema if not exists pagila; $$,
              $$ create schema if not exists mv;     $$,
              $$ alter database sakila set search_path to pagila, mv, public; $$;

       The database command accepts the following clauses and options:

       •   FROM

           Must be a connection URL pointing to a MySQL database.

           If  the  connection  URI  contains a table name, then only this table is migrated from
           MySQL to PostgreSQL.

           See the SOURCE CONNECTION STRING section  above  for  details  on  how  to  write  the
           connection        string.        Environment        variables       described       in
           http://dev.mysql.com/doc/refman/5.0/en/environment-variables.html  can  be   used   as
           default  values too. If the user is not provided, then it defaults to USER environment
           variable value. The password can be provided with the environment variable  MYSQL_PWD.
           The  host  can  be  provided  with  the  environment variable MYSQL_HOST and otherwise
           defaults to localhost.  The  port  can  be  provided  with  the  environment  variable
           MYSQL_TCP_PORT and otherwise defaults to 3306.

       •   WITH

           When  loading  from  a  MySQL  database,  the following options are supported, and the
           default WITH clause is: no truncate,  create  schema,  create  tables,  include  drop,
           create  indexes,  reset  sequences, foreign keys, downcase identifiers, uniquify index
           names.

           WITH options:

       •   include drop

           When this option is listed, pgloader drops all the tables  in  the  target  PostgreSQL
           database  whose  names  appear in the MySQL database. This option allows for using the
           same command several times in a row until you figure out  all  the  options,  starting
           automatically  from  a  clean  environment. Please note that CASCADE is used to ensure
           that tables are dropped even if there are foreign  keys  pointing  to  them.  This  is
           precisely  what  include  drop  is intended to do: drop all target tables and recreate
           them.

           Great care needs to be taken when using include  drop,  as  it  will  cascade  to  all
           objects  referencing  the  target tables, possibly including other tables that are not
           being loaded from the source DB.

       •   include no drop

           When this option is listed, pgloader will not include any DROP statement when  loading
           the data.

       •   truncate

           When this option is listed, pgloader issues a TRUNCATE command against each
           PostgreSQL table just before loading data into it.

       •   no truncate

           When this option is listed, pgloader issues no TRUNCATE command.

       •   disable triggers

           When this option is listed, pgloader issues an ALTER TABLE  ...  DISABLE  TRIGGER  ALL
           command  against the PostgreSQL target table before copying the data, then the command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This option allows loading data into a pre-existing table  ignoring  the  foreign  key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   create tables

           When this option is listed, pgloader creates the tables using the meta data found in
           the MySQL database, which must contain a list of fields with their data type. A
           standard data type conversion from MySQL to PostgreSQL is done.

       •   create no tables

           When this option is listed, pgloader skips the creation of tables before loading
           data; the target tables must then already exist.

           Also, when using create no tables pgloader fetches the metadata from the current
           target database and checks type casting, then removes constraints and indexes prior
           to loading the data and installs them back again once the loading is done.

       •   create indexes

           When this option is listed, pgloader gets the definitions of all the indexes found in
           the MySQL database and creates the same set of index definitions against the
           PostgreSQL database.

       •   create no indexes

           When this option is listed, pgloader skips creating indexes.

       •   drop indexes

           When  this  option is listed, pgloader drops the indexes in the target database before
           loading the data, and creates them again at the end of the data copy.

       •   uniquify index names, preserve index names

           MySQL index names are unique per-table whereas in PostgreSQL index names  have  to  be
           unique  per-schema.  The default for pgloader is to change the index name by prefixing
           it with idx_OID where OID is the internal numeric identifier of the table the index is
           built against.

           In some cases, such as when the DDL is entirely left to a framework, it might be
           sensible for pgloader to refrain from handling unique index names; that is achieved
           by using the preserve index names option.

           The default is to uniquify index names.

           Even when using the option preserve index names, MySQL primary key indexes named
           "PRIMARY" will get their names uniquified. Failing to do so would prevent the primary
           keys from being created again in PostgreSQL, where index names must be unique per
           schema.

       •   foreign keys

           When this option is listed, pgloader gets the definitions of all the foreign keys
           found in the MySQL database and creates the same set of foreign key definitions
           against the PostgreSQL database.

       •   no foreign keys

           When this option is listed, pgloader skips creating foreign keys.

       •   reset sequences

           When this option is listed, at the end of the data loading and after the indexes  have
           all  been created, pgloader resets all the PostgreSQL sequences created to the current
           maximum value of the column they are attached to.

           The options schema only and data only have no effects on this option.

       •   reset no sequences

           When this option is listed, pgloader skips resetting sequences after the load.

           The options schema only and data only have no effects on this option.

       •   downcase identifiers

           When this option is listed, pgloader converts all MySQL identifiers (table names,
           index names, column names) to lower case, except for PostgreSQL reserved keywords.

           The  PostgreSQL  reserved  keywords  are  determined  dynamically  by using the system
           function pg_get_keywords().

       •   quote identifiers

           When this option is listed, pgloader quotes all MySQL identifiers so that  their  case
           is  respected.  Note  that you will then have to do the same thing in your application
           code queries.

       •   schema only

           When this option is listed pgloader refrains from migrating the data over.  Note  that
           the  schema  in  this  context includes the indexes when the option create indexes has
           been listed.

       •   data only

           When this option is listed pgloader only issues the COPY statements, without doing any
           other processing.

       •   single reader per thread, multiple readers per thread

           The  default is single reader per thread and it means that each MySQL table is read by
           a single thread as a whole, with a single SELECT statement using no WHERE clause.

           When using multiple readers per thread pgloader may be able to divide the reading
           work into several threads, as many as the concurrency setting, which needs to be
           greater than 1 for this option to be activated.

           For each source table, pgloader searches for a primary key over a single numeric
           column, or a multiple-column primary key index for which the first column is of a
           numeric data type (one of integer or bigint). When such an index exists, pgloader
           runs a query to find the min and max values on this column, and then splits that
           range into many ranges, each containing at most rows per range rows.

           When the resulting range list contains at least as many ranges as the concurrency
           setting, those ranges are distributed to the reader threads.

           So when all the conditions are met, pgloader starts as many reader threads as the
           concurrency setting, and each reader thread issues several queries with a WHERE id >=
           x AND id < y clause, where y - x = rows per range or less (for the last range,
           depending on the max value just obtained).

       •   rows per range

           How many rows are fetched per SELECT query when using multiple readers per thread, see
           above for details.
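
           As a hedged sketch, a WITH clause combining those settings might read as follows (the
           numbers are arbitrary):

               WITH multiple readers per thread, concurrency = 2, rows per range = 25000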

       •   SET MySQL PARAMETERS

           The SET MySQL PARAMETERS clause allows setting MySQL parameters using the MySQL SET
           command each time pgloader connects to it.

       •   CAST

           The cast clause allows specifying custom casting rules, either to overload the
           default casting rules or to amend them with special cases.

           A casting rule is expected to follow one of the forms:

           type <mysql-type-name> [ <guard> ... ] to <pgsql-type-name> [ <option> ... ]
           column <table-name>.<column-name> [ <guards> ] to ...

       It´s possible for a casting rule to either match against a MySQL data type or against a
       given column name in a given table. That flexibility makes it possible to cope with cases
       where the type tinyint might have been used as a boolean in some places but as a smallint
       in others.

       The casting rules are applied in order; the first match prevents the following rules from
       being applied, and user defined rules are evaluated first.
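
       As a hedged sketch combining both forms (the table and column names are hypothetical):

           CAST type tinyint to boolean drop typemod using tinyint-to-boolean,
                column film.rating to text drop typemod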

       The supported guards are:

       •   when default ´value´

           The casting rule is only applied against MySQL columns of the source type that have
           the given default value, which must be a single-quoted or a double-quoted string.

       •   when typemod expression

           The casting rule is only applied against MySQL columns of the source type that have  a
           typemod value matching the given typemod expression. The typemod is separated into its
           precision and scale components.

           Example of a cast rule using a typemod guard:

           type char when (= precision 1) to char keep typemod

       This expression casts MySQL char(1) columns to PostgreSQL columns of type char(1), while
       in the general case char(N) will be converted by the default cast rule into the PostgreSQL
       type varchar(N).

       •   with extra auto_increment

           The casting rule is only  applied  against  MySQL  columns  having  the  extra  column
           auto_increment  option  set,  so  that it´s possible to target e.g. serial rather than
           integer.

           The default matching behavior, when this option isn´t set, is to  match  both  columns
           with the extra definition and without.

           This means that if you want to implement a casting rule that targets either serial or
           integer from a smallint definition depending on the auto_increment extra bit of
           information from MySQL, then you need to spell out two casting rules as follows:

           type smallint  with extra auto_increment
             to serial drop typemod keep default keep not null,
           type smallint
             to integer drop typemod keep default keep not null

       The supported casting options are:

       •   drop default, keep default

           When the option drop default is listed, pgloader drops any existing default expression
           in the MySQL database for columns of the source type from the CREATE  TABLE  statement
           it generates.

           The  spelling  keep  default  explicitly  prevents  that  behaviour and can be used to
           overload the default casting rules.

       •   drop not null, keep not null, set not null

           When the option drop not  null  is  listed,  pgloader  drops  any  existing  NOT  NULL
           constraint  associated with the given source MySQL datatype when it creates the tables
           in the PostgreSQL database.

           The spelling keep not null explicitly prevents that  behaviour  and  can  be  used  to
           overload the default casting rules.

           When  the  option  set  not null is listed, pgloader sets a NOT NULL constraint on the
           target column regardless whether it has been set in the source MySQL column.

       •   drop typemod, keep typemod

           When the option drop typemod is listed, pgloader drops any existing typemod definition
           (e.g.  precision and scale) from the datatype definition found in the MySQL columns of
           the source type when it created the tables in the PostgreSQL database.

           The spelling keep typemod explicitly prevents  that  behaviour  and  can  be  used  to
           overload the default casting rules.

       •   using

           This option takes as its single argument the name of a function to be found in the
           pgloader.transforms Common Lisp package. See the TRANSFORMATION FUNCTIONS section
           below for details.

           It´s possible to augment a default cast rule (such as one that  applies  against  ENUM
           data  type  for  example) with a transformation function by omitting entirely the type
           parts of the casting rule, as in the following example:

           column enumerate.foo using empty-string-to-null

       •   MATERIALIZE VIEWS

           This clause allows you to implement custom data  processing  at  the  data  source  by
           providing  a  view  definition  against  which  pgloader will query the data. It´s not
           possible to just allow for plain SQL because we want to know a  lot  about  the  exact
           data types of each column involved in the query output.

           This clause expects a comma separated list of view definitions, each one being either
           the name of an existing view in your database or the following expression:

           name AS $$ sql query $$

           The name and the sql query will be used in a CREATE VIEW statement at the beginning of
           the  data  loading, and the resulting view will then be dropped at the end of the data
           loading.
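
           As a hedged sketch, assuming the sakila source schema used in the example above (the
           view name and its query are hypothetical):

               MATERIALIZE VIEWS film_titles AS $$ select film_id, title from film $$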

       •   MATERIALIZE ALL VIEWS

           Same behaviour as MATERIALIZE VIEWS using the dynamic list of  views  as  returned  by
           MySQL rather than asking the user to specify the list.

       •   INCLUDING ONLY TABLE NAMES MATCHING

           Introduce a comma separated list of table names or regular expressions used to limit
           the tables to migrate to a sublist.

           Example:

           INCLUDING ONLY TABLE NAMES MATCHING ~/film/, ´actor´

       •   EXCLUDING TABLE NAMES MATCHING

           Introduce a comma separated list of table names or regular expressions used to
           exclude table names from the migration. This filter only applies to the result of the
           INCLUDING filter.

           EXCLUDING TABLE NAMES MATCHING ~<ory>

       •   DECODING TABLE NAMES MATCHING

           Introduce a comma separated list of table names or regular expressions used  to  force
           the encoding to use when processing data from MySQL. If the data encoding known to you
           is different from MySQL´s idea about it, this is the option to use.

           DECODING TABLE NAMES MATCHING ~/messed/, ~/encoding/ AS utf8

       You can use as many such rules as you need, all with possibly different encodings.

       •   ALTER TABLE NAMES MATCHING

           Introduce a comma separated list of table names or regular expressions that you want
           to target in the pgloader ALTER TABLE command. The available actions are SET SCHEMA
           and RENAME TO, which both take a quoted string as parameter, and SET, which takes a
           parenthesized list of storage parameters:

           ALTER TABLE NAMES MATCHING ~/_list$/, ´sales_by_store´, ~/sales_by/
            SET SCHEMA ´mv´

           ALTER TABLE NAMES MATCHING ´film´ RENAME TO ´films´

           ALTER TABLE NAMES MATCHING ~/./ SET (fillfactor=´40´)

       You can use as many such rules as you need. The list of tables to be migrated is searched
       in pgloader memory against the ALTER TABLE matching rules, and for each command pgloader
       stops at the first matching criterion (regexp or string).

       No ALTER TABLE command is sent to PostgreSQL, the modification happens at the level of the
       pgloader  in-memory  representation  of  your  source  database  schema. In case of a name
       change, the mapping is kept and reused in the foreign key and index support.

       The SET () action takes effect as a WITH clause for the CREATE TABLE command that pgloader
       will run when it has to create a table.

   LIMITATIONS
        The database command currently only supports MySQL as a source database and has the
        following limitations:

       •   Views are not migrated,

           Supporting views might require implementing a full SQL parser for  the  MySQL  dialect
           with  a  porting  engine  to  rewrite  the  SQL against PostgreSQL, including renaming
           functions and changing some constructs.

           While it´s not theoretically impossible, don´t hold your breath.

       •   Triggers are not migrated

           The difficulty of doing so is not yet assessed.

       •   ON UPDATE CURRENT_TIMESTAMP is currently not migrated

           It´s simple enough to implement, just not on the priority list yet.

       •   Of the geometric datatypes, only the POINT datatype has been covered. The other ones
           should be easy enough to implement; it´s just not done yet.

   DEFAULT MySQL CASTING RULES
       When migrating from MySQL the following Casting Rules are provided:

       Numbers:

       •   type int with extra auto_increment to serial when (< precision 10)

       •   type int with extra auto_increment to bigserial when (<= 10 precision)

       •   type int to int when (< precision 10)

       •   type int to bigint when (<= 10 precision)

       •   type tinyint with extra auto_increment to serial

       •   type smallint with extra auto_increment to serial

       •   type mediumint with extra auto_increment to serial

       •   type bigint with extra auto_increment to bigserial

       •   type tinyint to boolean when (= 1 precision) using tinyint-to-boolean

       •   type tinyint to smallint drop typemod

       •   type smallint to smallint drop typemod

       •   type mediumint to integer drop typemod

       •   type integer to integer drop typemod

       •   type float to float drop typemod

       •   type bigint to bigint drop typemod

       •   type double to double precision drop typemod

       •   type numeric to numeric keep typemod

       •   type decimal to decimal keep typemod

       Texts:

       •   type char to char keep typemod using remove-null-characters

       •   type varchar to varchar keep typemod using remove-null-characters

       •   type tinytext to text using remove-null-characters

       •   type text to text using remove-null-characters

       •   type mediumtext to text using remove-null-characters

       •   type longtext to text using remove-null-characters

       Binary:

       •   type binary to bytea

       •   type varbinary to bytea

       •   type tinyblob to bytea

       •   type blob to bytea

       •   type mediumblob to bytea

       •   type longblob to bytea

       Date:

       •   type  datetime when default "0000-00-00 00:00:00" and not null to timestamptz drop not
           null drop default using zero-dates-to-null

       •   type datetime when default "0000-00-00 00:00:00" to  timestamptz  drop  default  using
           zero-dates-to-null

       •   type timestamp when default "0000-00-00 00:00:00" and not null to timestamptz drop not
           null drop default using zero-dates-to-null

       •   type timestamp when default "0000-00-00 00:00:00" to timestamptz  drop  default  using
           zero-dates-to-null

       •   type date when default "0000-00-00" to date drop default using zero-dates-to-null

       •   type date to date

       •   type datetime to timestamptz

       •   type timestamp to timestamptz

       •   type year to integer drop typemod

       Geometric:

       •   type point to point using pgloader.transforms::convert-mysql-point

       Enum  types  are  declared  inline  in  MySQL and separately with a CREATE TYPE command in
       PostgreSQL, so each column of Enum Type is converted to a type named after the  table  and
       column names defined with the same labels in the same order.

       When  the  source  type  definition is not matched in the default casting rules nor in the
       casting rules provided in the command, then the type name with the typemod is used.

LOAD SQLite DATABASE

       This command instructs pgloader to load data from a SQLite file. Automatic discovery of
       the schema is supported, including building the indexes.

       Here´s an example:

           load database
                from sqlite:///Users/dim/Downloads/lastfm_tags.db
                into postgresql:///tags

            with include drop, create tables, create indexes, reset sequences

             set work_mem to ´16MB´, maintenance_work_mem to ´512 MB´;

       The sqlite command accepts the following clauses and options:

       •   FROM

           Path or HTTP URL to a SQLite file, might be a .zip file.

       •   WITH

           When  loading  from  a  SQLite  database, the following options are supported, and the
           default WITH clause is: no truncate, create  tables,  include  drop,  create  indexes,
           reset sequences, downcase identifiers, encoding ´utf-8´.

       •   include drop

           When  this  option  is  listed, pgloader drops all the tables in the target PostgreSQL
           database whose names appear in the SQLite database. This option allows for  using  the
           same  command  several  times  in a row until you figure out all the options, starting
           automatically from a clean environment. Please note that CASCADE  is  used  to  ensure
           that  tables  are  dropped  even  if  there are foreign keys pointing to them. This is
           precisely what include drop is intended to do: drop all  target  tables  and  recreate
           them.

           Great  care  needs  to  be  taken  when  using include drop, as it will cascade to all
           objects referencing the target tables, possibly including other tables  that  are  not
           being loaded from the source DB.

       •   include no drop

           When  this option is listed, pgloader will not include any DROP statement when loading
           the data.

       •   truncate

           When this option is listed, pgloader issues a TRUNCATE command against each
           PostgreSQL table just before loading data into it.

       •   no truncate

           When this option is listed, pgloader issues no TRUNCATE command.

       •   disable triggers

           When  this  option  is  listed, pgloader issues an ALTER TABLE ... DISABLE TRIGGER ALL
           command against the PostgreSQL target table before copying the data, then the  command
           ALTER TABLE ... ENABLE TRIGGER ALL once the COPY is done.

           This  option  allows  loading  data into a pre-existing table ignoring the foreign key
           constraints  and  user  defined  triggers  and  may  result  in  invalid  foreign  key
           constraints once the data is loaded. Use with care.

       •   create tables

           When this option is listed, pgloader creates the tables using the meta data found in
           the SQLite file, which must contain a list of fields with their data type. A standard
           data type conversion from SQLite to PostgreSQL is done.

       •   create no tables

           When this option is listed, pgloader skips the creation of tables before loading
           data; the target tables must then already exist.

           Also, when using create no tables pgloader fetches the metadata from the current
           target database and checks type casting, then removes constraints and indexes prior
           to loading the data and installs them back again once the loading is done.

       •   create indexes

           When this option is listed, pgloader gets the definitions of all the indexes found in
           the SQLite database and creates the same set of index definitions against the
           PostgreSQL database.

       •   create no indexes

           When this option is listed, pgloader skips creating indexes.

       •   drop indexes

           When this option is listed, pgloader drops the indexes in the target  database  before
           loading the data, and creates them again at the end of the data copy.

       •   reset sequences

           When  this option is listed, at the end of the data loading and after the indexes have
           all been created, pgloader resets all the PostgreSQL sequences created to the  current
           maximum value of the column they are attached to.

       •   reset no sequences

           When this option is listed, pgloader skips resetting sequences after the load.

           The options schema only and data only have no effects on this option.

       •   schema only

           When  this  option  is listed pgloader will refrain from migrating the data over. Note
           that the schema in this context includes the indexes when the  option  create  indexes
           has been listed.

       •   data only

           When this option is listed pgloader only issues the COPY statements, without doing any
           other processing.

       •   encoding

           This option controls which encoding the SQLite text data is parsed with. It defaults
           to UTF-8.

       •   CAST

           The cast clause allows specifying custom casting rules, either to overload the
           default casting rules or to amend them with special cases.

           Please refer to the MySQL CAST clause for details.

       •   INCLUDING ONLY TABLE NAMES LIKE

           Introduce a comma separated list of table name patterns used to limit  the  tables  to
           migrate to a sublist.

           Example:

           INCLUDING ONLY TABLE NAMES LIKE ´Invoice%´

       •   EXCLUDING TABLE NAMES LIKE

           Introduce  a  comma  separated list of table name patterns used to exclude table names
           from the migration. This filter only applies to the result of the INCLUDING filter.

           EXCLUDING TABLE NAMES LIKE ´appointments´

   DEFAULT SQLite CASTING RULES
       When migrating from SQLite the following Casting Rules are provided:

       Numbers:

       •   type tinyint to smallint using integer-to-string

       •   type integer to bigint using integer-to-string

       •   type float to float using float-to-string

       •   type real to real using float-to-string

       •   type double to double precision using float-to-string

       •   type numeric to numeric using float-to-string

       Texts:

       •   type character to text drop typemod

       •   type varchar to text drop typemod

       •   type nvarchar to text drop typemod

       •   type char to text drop typemod

       •   type nchar to text drop typemod

       •   type clob to text drop typemod

       Binary:

       •   type blob to bytea

       Date:

       •   type datetime to timestamptz using sqlite-timestamp-to-timestamp

       •   type timestamp to timestamptz using sqlite-timestamp-to-timestamp

       •   type timestamptz to timestamptz using sqlite-timestamp-to-timestamp

LOAD MS SQL DATABASE

       This command instructs pgloader to load data from a MS SQL database. Automatic discovery
       of the schema is supported, including building the indexes, primary keys and foreign key
       constraints.

       Here´s an example:

           load database
                from mssql://user@host/dbname
                into postgresql:///dbname

           including only table names like ´GlobalAccount´ in schema ´dbo´

           set work_mem to ´16MB´, maintenance_work_mem to ´512 MB´

           before load do $$ drop schema if exists dbo cascade; $$;

       The mssql command accepts the following clauses and options:

       •   FROM

           Connection string to an existing MS SQL database server that listens for and welcomes
           external TCP/IP connections. As pgloader currently piggybacks on the FreeTDS driver,
           to change the port of the server, export the TDSPORT environment variable.

       •   WITH

           When loading from a MS SQL database, the same options as when loading a MySQL database
           are supported. Please refer to the MySQL section. The following options are added:

       •   create schemas

           When  this  option is listed, pgloader creates the same schemas as found on the MS SQL
           instance. This is the default.

       •   create no schemas

           When this option is listed, pgloader refrains from creating any schemas at all; you
           must then ensure that the target schemas already exist.

       •   CAST

           The cast clause allows specifying custom casting rules, either to overload the
           default casting rules or to amend them with special cases.

           Please refer to the MySQL CAST clause for details.

       •   INCLUDING ONLY TABLE NAMES LIKE ´...´ [, ´...´] IN SCHEMA ´...´

           Introduce a comma separated list of table name patterns used to limit the tables to
           migrate to a sublist. More than one such clause may be used; they will all be
           accumulated together.

           Example:

            including only table names like ´GlobalAccount´ in schema ´dbo´

       •   EXCLUDING TABLE NAMES LIKE ´...´ [, ´...´] IN SCHEMA ´...´

           Introduce a comma separated list of table name patterns used to  exclude  table  names
           from the migration. This filter only applies to the result of the INCLUDING filter.

            EXCLUDING TABLE NAMES LIKE ´LocalAccount´ in schema ´dbo´

       •   ALTER SCHEMA ´...´ RENAME TO ´...´

           Allows renaming a schema on the fly, so that for instance the tables found in the
           schema ´dbo´ in your source database will get migrated into the schema ´public´ in
           the target database with this command:

           ALTER SCHEMA ´dbo´ RENAME TO ´public´

       •   ALTER TABLE NAMES MATCHING ... IN SCHEMA ´...´

           See the MySQL explanation for this clause above. It works the same in the context of
           migrating from MS SQL, only with the added option to specify the name of the schema
           in which to find the definition of the target tables.

           The matching is done in pgloader itself, with a Common Lisp regular expression
           library, so it doesn´t depend on the LIKE implementation of MS SQL, nor on the lack
           of support for regular expressions in the engine.

   Driver setup and encoding
        pgloader uses the FreeTDS driver, and internally expects the data to be sent in utf-8.
        To achieve that, you can configure the FreeTDS driver with those defaults, in the file
        ~/.freetds.conf:

           [global]
               tds version = 7.4
               client charset = UTF-8

   DEFAULT MS SQL CASTING RULES
       When migrating from MS SQL the following Casting Rules are provided:

       Numbers:

       •   type tinyint to smallint

       •   type float to float using float-to-string

       •   type real to real using float-to-string

       •   type double to double precision using float-to-string

       •   type numeric to numeric using float-to-string

       •   type decimal to numeric using float-to-string

       •   type money to numeric using float-to-string

       •   type smallmoney to numeric using float-to-string

       Texts:

       •   type char to text drop typemod

       •   type nchar to text drop typemod

       •   type varchar to text drop typemod

       •   type nvarchar to text drop typemod

       •   type xml to text drop typemod

       Binary:

       •   type binary to bytea using byte-vector-to-bytea

       •   type varbinary to bytea using byte-vector-to-bytea

       Date:

       •   type datetime to timestamptz

       •   type datetime2 to timestamptz

       Others:

       •   type bit to boolean

       •   type hierarchyid to bytea

       •   type geography to bytea

       •   type uniqueidentifier to uuid using sql-server-uniqueidentifier-to-uuid

TRANSFORMATION FUNCTIONS

       Some data types are implemented in a different enough way that a transformation function
       is necessary. This function must be written in Common Lisp and is searched for in the
       pgloader.transforms package.

       Some default transformation functions are provided with pgloader, and you can use the
       --load command line option to load and compile your own lisp file into pgloader at
       runtime. For your functions to be found, remember to begin your lisp file with the
       following form:

           (in-package #:pgloader.transforms)
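
       As a hedged illustration only, a user-defined transformation might look like the
       following; the function name and its behaviour are hypothetical, and this sketch assumes
       that a transformation function receives the column value (a string, or nil) and returns
       the value to load, nil being loaded as a PostgreSQL NULL:

           (in-package #:pgloader.transforms)

           ;; hypothetical example: load the literal string "N/A" as a SQL NULL
           (defun n/a-to-null (value)
             (if (and value (string= value "N/A"))
                 nil
                 value))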

       The provided transformation functions are:

       •   zero-dates-to-null

           When the input date is all zeroes, this function returns nil, which gets loaded as a
           PostgreSQL NULL value.

       •   date-with-no-separator

           Applies zero-dates-to-null then transforms the given date into a format that
           PostgreSQL will actually process:

           In:  "20041002152952"
           Out: "2004-10-02 15:29:52"

       •   time-with-no-separator

           Transforms the given time into a format that PostgreSQL will actually process:

           In:  "08231560"
           Out: "08:23:15.60"

       •   tinyint-to-boolean

           As MySQL lacks a proper boolean type, tinyint is often used to  implement  that.  This
           function transforms 0 to ´false´ and anything else to ´true´.

       •   bits-to-boolean

           As MySQL lacks a proper boolean type, BIT is often used to implement that. This
           function transforms 1-bit bit vectors from 0 to f and any other value to t.

       •   int-to-ip

           Convert an integer into a dotted representation of an ip4.

           In:  18435761
           Out: "1.25.78.177"

       •   ip-range

           Converts a couple of integers given as strings into a range of ip4.

           In:  "16825344" "16825599"
           Out: "1.0.188.0-1.0.188.255"

       •   convert-mysql-point

           Converts from  the  astext  representation  of  points  in  MySQL  to  the  PostgreSQL
           representation.

           In:  "POINT(48.5513589 7.6926827)"
           Out: "(48.5513589,7.6926827)"

       •   integer-to-string

           Converts an integer string or a Common Lisp integer into a string suitable for a
           PostgreSQL integer. Takes care of quoted integers.

           In:  "\"0\""
           Out: "0"

       •   float-to-string

           Converts a Common Lisp float into a string suitable for a PostgreSQL float:

           In:  100.0d0
           Out: "100.0"

       •   set-to-enum-array

           Converts a string representing a MySQL SET into a PostgreSQL Array of Enum values from
           the set.

           In: "foo,bar"
           Out: "{foo,bar}"

       •   empty-string-to-null

           Convert an empty string to a null.

       •   right-trim

           Remove whitespace at end of string.

       •   remove-null-characters

           Remove NUL characters (0x0) from given strings.

       •   byte-vector-to-bytea

           Transforms a simple array of unsigned bytes to the PostgreSQL bytea Hex Format
           representation as documented at
           http://www.postgresql.org/docs/9.3/interactive/datatype-binary.html

       •   sqlite-timestamp-to-timestamp

           The SQLite type system is quite flexible, so this function copes with it to produce
           timestamp literals as expected by PostgreSQL. That covers years on only 4 digits,
           zero dates converted to null, and proper date strings.

       •   sql-server-uniqueidentifier-to-uuid

           The SQL Server driver receives data of type uniqueidentifier as a byte vector that
           then needs to be converted to a UUID string for the PostgreSQL COPY input format to
           process.

       •   unix-timestamp-to-timestamptz

           Converts a unix timestamp (number of seconds elapsed since beginning of 1970)  into  a
           proper PostgreSQL timestamp format.

       •   varbinary-to-string

           Converts a binary encoded string (such as a MySQL varbinary entry) to decoded text,
           using the table´s encoding, which may be overloaded with the DECODING TABLE NAMES
           MATCHING clause.

LOAD MESSAGES

       This command is still experimental and allows receiving messages via UDP using a
       syslog-like format and, depending on rule matching, loads named portions of the data
       stream into a destination table.

           LOAD MESSAGES
               FROM syslog://localhost:10514/

            WHEN MATCHES rsyslog-msg IN apache
             REGISTERING timestamp, ip, rest
                    INTO postgresql://localhost/db?logs.apache
                     SET guc_1 = ´value´, guc_2 = ´other value´

            WHEN MATCHES rsyslog-msg IN others
             REGISTERING timestamp, app-name, data
                    INTO postgresql://localhost/db?logs.others
                     SET guc_1 = ´value´, guc_2 = ´other value´

               WITH apache = rsyslog
                    DATA   = IP REST
                    IP     = 1*3DIGIT "." 1*3DIGIT "."1*3DIGIT "."1*3DIGIT
                    REST   = ~/.*/

               WITH others = rsyslog;

       As  the  command  is still experimental the options might be changed in the future and the
       details are not documented.

AUTHOR

       Dimitri Fontaine dimitri@2ndQuadrant.fr

SEE ALSO

       PostgreSQL COPY documentation at http://www.postgresql.org/docs/9.3/static/sql-copy.html.

       The pgloader source code, binary packages, documentation and examples  may  be  downloaded
       from http://pgloader.io/.