Provided by: plucker_1.8-34_amd64
NAME
plucker-build - generate a document (e-book) in Plucker format
SYNOPSIS
plucker-build [--alt-maxheight=pixel-height] [--alt-maxwidth=pixel-width] [--author=string] [--backup] [--beamable] [--bpp=image-depth] [--category=default-category-name] [--charset=charset-indicator] [--compression=compression-type] [--depth-first] [--doc-file=name-prefix] [--doc-name=document-name] [--doc-compression] [--exclusion-list=filename] [--extra-section=section-name] [--help] [--home-url=base-URL] [--icon=image-filename] [--launchable] [--maxdepth=depth] [--maxheight=pixel-height] [--maxwidth=pixel-width] [--no-backup] [--noimages] [--not-beamable] [--not-launchable] [--no-urlinfo] [--owner-id=name] [--pluckerdir=output-directory] [--pluckerhome=plucker-home-directory] [--quiet] [--referrer=string] [--status-file=filename] [--staybelow=url-prefix] [--stayonhost] [--title=string] [--update-cache] [--url-pattern=pattern] [--user-agent=string] [--verbosity=verbosity-level] [--zlib-compression] [HOME-URL]
DESCRIPTION
plucker-build creates a Plucker binary document, which is a kind of e-book, from a URL. This document is formatted for the Plucker viewer program, which currently runs on Palm devices. The normal mode of operation is to take a home URL and 'pluck' it to produce a Plucker document, either to stdout, or to a file if --doc-file is specified. Alternatively, specifying the option --update-cache will update a cache of Plucker records (though it's not clear what this is good for). The Plucker document format is specified at http://www.plkr.org/index.pl/cvs/docs/DBFormat.html?rev=HEAD.
OPTIONS
Many options are also available as parameters in the configuration file $HOME/.pluckerrc, or in the default configuration file. Where applicable, the name of the configuration file parameter is shown in square brackets after the documentation on the option. An option given on the command line will override any configuration file parameter. For more on configuration files, see CONFIGURATION FILES below.

--alt-maxheight=pixel-height
    Specifies the maximum height, in pixels, of the alternate rendition of an image. (When inline images are too large to be included full-size, they are converted into smaller versions, with sizes governed by the MAXHEIGHT and MAXWIDTH parameters, and are linked to larger renditions of the images, called the alternate rendition.) [alt_maxheight]

--alt-maxwidth=pixel-width
    Specifies the maximum width, in pixels, of the alternate rendition of an image. (When inline images are too large to be included full-size, they are converted into smaller versions, with sizes governed by the MAXHEIGHT and MAXWIDTH parameters, and are linked to larger renditions of the images, called the alternate rendition.) [alt_maxwidth]

--author=string
    Sets the author of the document to string, which is assumed to be in the charset of the document (see --charset), or ASCII if no charset is specified. [author_md]

--backup
    Sets the bit in the output file that causes the document to be backed up on Palm HotSync. By default, the document is backed up. [backup_bit]

--beamable
    Sets the bit in the output file that allows the document to be beamed. By default, the document is beamable. [copyprevention_bit]

--bpp=image-depth
    Specifies the number of bits-per-pixel to be used for images. Valid values as of Plucker 1.1 are 0, 1 (the default), 2, 4, or 8. If 0 is specified, no images will be included in the document. See also --noimages. [bpp]

--category=default-category-name
    Specifies a default Plucker category or categories to include in the document.
    If more than one category is specified, the category names should be separated by semicolons. [category]

--charset=charset-indicator
    Specifies the default character set encoding used in the text of the documents being plucked. charset-indicator is either a charset name (from a small list; see src/parser/python/PyPlucker/__init__.py.in for a list of valid names), or a decimal integer indicating the charset's MIBenum value, as shown in the table at http://www.iana.org/assignments/character-sets. [default_charset]

--compression=compression-type
    Specifies the type of compression to use in the document. There are two possible values for compression-type: doc or zlib. The default is doc, which is the same compression system used in Palm DOC-format documents. zlib compression usually results in smaller documents. See also --zlib-compression and --doc-compression. [compression]

--depth-first
    Specifies a depth-first traversal of the web graph, rather than the default breadth-first traversal. This often works better on bushy acyclic graph structures than the breadth-first traversal. [depth_first]

--doc-file=name-prefix (or -f name-prefix)
    Specifies the name of the document output file, without the directory (specified with --pluckerdir) or extension (always .pdb). If not specified, and if stdout is not a tty, the document will be written to stdout. [doc_file]

--doc-name=document-name (or -N document-name)
    Specifies the short name by which the document will be identified in the viewer. Defaults to the value of --doc-file. If --doc-file is not specified, the document name defaults to the home URL. This name should be limited to 26 characters. [doc_name]

--doc-compression
    Specifies that Doc compression, the compression scheme developed for the Palm DOC format, should be used for the parts of this document. This is the default. See also --zlib-compression and --compression.

--exclusion-list=filename (or -E filename)
    Adds a file to the exclusion list, a list of files containing information on URLs to exclude from the document. See the User's Guide for more information on exclusion lists. [exclusion_lists]

--extra-section=section-name (or -s section-name)
    Adds a section to the list of sections searched in the configuration files. A section is a named set of configuration information. By default, the DEFAULT section will be searched, then any operating-system-specific sections, then any sections specified on the command line.

--help (or -h)
    Outputs help on command-line parameters.

--home-url=base-URL (or -H base-URL)
    Specifies the URL from which the document is to be constructed. This may also be specified as a single argument on the command line. If a home URL is not specified, it will default to file:/$HOME/.plucker/home.html. This default may be changed in your .pluckerrc file. Note that this value must be a valid absolute URL. A special URL scheme, plucker:, is supported. It specifies files on the Plucker search path, which consists of PluckerDir (the Plucker current working directory) followed by PluckerHome (the Plucker home directory). [home_url]

--icon=image-filename
    If the output file is launchable, this switch can be used to specify the large icon shown in the launcher for the document. If not specified, a default icon is used. If the output file is not launchable, this switch has no effect. See also --launchable. [big_icon]

--launchable
    Specifies that the output document should be shown as an icon in the system launcher. Clicking on the icon will start Plucker and select this document. By default, documents are not launchable. [launchable_bit]

--maxdepth=depth (or -M depth)
    Specifies the number of levels of links the parser will traverse when converting the input. It is best to keep this value small, or the size of your document can get very large.
    If you want just a page, but none of the pages pointed to by that page, use a value of 1. [home_maxdepth]

--maxheight=pixel-height
    Specifies the maximum height, in pixels, for an inline image. Overrides the MAXHEIGHT parameter in the configuration file, but is in turn overridden by any height specification in the image link itself. [maxheight]

--maxwidth=pixel-width
    Specifies the maximum width, in pixels, for an inline image. Overrides the MAXWIDTH parameter in the configuration file, but is in turn overridden by any width specification in the image link itself. [maxwidth]

--no-backup
    Clears the bit in the output file that causes the document to be backed up on Palm HotSync. By default, the document is backed up. [backup_bit]

--noimages
    Specifies that no images will be included. Identical to --bpp=0. See also --bpp.

--not-beamable
    Sets the bit in the output file that prevents the document from being beamed. By default, the document is beamable. [copyprevention_bit]

--not-launchable
    Specifies that the output document should not be shown as an icon in the system launcher. By default, documents are not launchable. [launchable_bit]

--no-urlinfo
    Specifies that no URL information will be included in the document. When links are included in documents, the information about the actual URL is included by default. This is often handy for external references (links to documents not included in the document). Use of this option may result in a slightly smaller document. [no_urlinfo]

--owner-id=name
    Specifies an owner-id for the document. This causes the document to be lightly encrypted in such a way that it will only open on a device with a matching owner-id. With the PalmOS viewer, the HotSync UserName is used as the owner-id. [owner_id_build]

--pluckerhome=plucker-home-directory (or -P plucker-home-directory)
    Overrides the default value for PluckerHome, which is $HOME/.plucker/. Can also be specified by setting the environment variable PLUCKERHOME.
    An explicit value for --pluckerhome overrides any setting of PLUCKERHOME. [PLUCKERHOME]

--pluckerdir=output-directory (or -p output-directory)
    Overrides the default value for PluckerDir, which defaults to PluckerHome (see --pluckerhome). PluckerDir is the default directory to which output documents will be written, and which will be searched for input files if the plucker: URL scheme is used. [pluckerdir]

--quiet (or -q)
    Same as --verbosity=0.

--referrer=string
    When using HTTP to gather input, send string as the value of the HTTP Referer header (the header name is spelled with a single "r" in HTTP). Default is to send no such header. [referrer]

--status-file=filename
    Gives the name of a file that is read to get an estimate of the total number of pages to be processed, and that is then continually rewritten with a single line giving the number of pages collected so far, the number of links still to process, and the estimated number of total pages that will be gathered (or zero if this is not known). The three values are written as space-separated ASCII numbers. The status line is overwritten as the pluck progresses, so the file always contains only a single line. [status_file]

--staybelow=url-prefix
    Automatically excludes all URLs that do not start with url-prefix. A handy way to process a subtree. [home_staybelow]

--stayondomain
    Specifies that no web hosts other than those in the same domain as the original base URL will be visited for parts of the document. [home_stayondomain]

--stayonhost
    Specifies that no web hosts other than that named in the original base URL will be visited for parts of the document. [home_stayonhost]

--title=string
    Sets the title of the document to string. This is different from the name of the document (see --doc-name) in that it may be relatively long. The string is assumed to be in the charset of the document (see --charset), or ASCII if no charset is specified. [title_md]

--update-cache (or -c)
    Update the Plucker cache of records, rather than build a document.
    [use_cache]

--url-pattern=pattern
    Automatically excludes all URLs that do not match the regular expression pattern. The regular expression language used is that of the Python 're' module, as specified in http://www.python.org/doc/current/lib/re-syntax.html. [home_url_pattern]

--user-agent=string
    When using HTTP to gather input, send string as the value of the User-Agent HTTP header. Default is to send "Plucker/Py-XX", where XX is the Plucker version. [user_agent]

--verbosity=verbosity-level (or -V verbosity-level)
    Sets the level of status information output to the value specified by verbosity-level. Appropriate values are 0, for total silence, 1, for standard progress status (the default value), and 2, for lots of output about gathering and parsing the input (usually reserved for debugging). Values larger than 2 will also work, but tend to give profuse output that's only useful to developers. See also --quiet. [verbosity]

--zlib-compression
    Specifies that Zlib compression should be used for the parts of this document. This is considerably more efficient than the default compression format, Doc compression. See also --doc-compression and --compression.
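
The --status-file description above defines a small, stable format: one line holding three space-separated ASCII numbers. A front end that wants to show progress could read it roughly as follows. This is a minimal sketch, not part of Plucker; the names PluckStatus, parse_status_line, and progress_fraction are hypothetical, and the sketch assumes the line has already been read from the file.

```python
from typing import NamedTuple, Optional


class PluckStatus(NamedTuple):
    """The three values written to the --status-file line (hypothetical type)."""
    collected: int        # pages collected so far
    pending: int          # links still to process
    estimated_total: int  # estimated total pages, or 0 if not known


def parse_status_line(line: str) -> PluckStatus:
    """Parse one status line of three space-separated ASCII numbers."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    collected, pending, estimated_total = (int(field) for field in fields)
    return PluckStatus(collected, pending, estimated_total)


def progress_fraction(status: PluckStatus) -> Optional[float]:
    """Return completion as a fraction, or None when the total is unknown."""
    if status.estimated_total <= 0:
        return None
    return min(1.0, status.collected / status.estimated_total)
```

Because the file is overwritten while the pluck runs, a reader may occasionally see a partial line; catching ValueError and retrying is one way to cope with that.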
EXAMPLES
To build a pocket version of the weekly cafeteria menu at the foo.com cafeteria, available on the Web at http://www.foo.com/cafe/weeklymenu.html, and put the document in a file named /tmp/Menu.pdb, one would say:

    % plucker-build http://www.foo.com/cafe/weeklymenu.html >/tmp/Menu.pdb

Or alternatively, to build the same document without following any links, without including any images, and with the name "Cafeteria Menu":

    % plucker-build --pluckerdir=/tmp \
        --doc-name="Cafeteria Menu" \
        --doc-file=Menu \
        --home-url="http://www.foo.com/cafe/weeklymenu.html" \
        --maxdepth=1 \
        --bpp=0
    Pluckerdir is '/tmp'...
    ---- 0 collected, 1 to do ----
    Processing http://www.foo.com/cafe/weeklymenu.html...
    Retrieved ok.
    Parsed ok.
    ---- all pages retrieved and parsed ----
    Writing out collected data...
    Writing document 'Cafeteria Menu' to file /tmp/Menu.pdb
    Converting http://www.foo.com/cafe/weeklymenu.html...
    Wrote 1 <= plucker:/~special~/index
    Wrote 2 <= http://www.foo.com/cafe/weeklymenu.html
    Wrote 3 <= plucker:/~special~/pluckerlinks
    Wrote 5 <= plucker:/~special~/metadata
    Wrote 11 <= plucker:/~special~/links1
    Done!
    % ls -l /tmp/Menu.pdb
    -rw-rw-r-- 1 user somegroup 2646 Nov 2 21:19 /tmp/Menu.pdb
    %
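
When the same pluck is run from a script rather than typed at a shell, it can help to assemble the argument list programmatically. The following is a minimal sketch that only builds and prints the command for the cafeteria example; the helper name build_command and its parameters are hypothetical, and only options documented in this page are used.

```python
import shlex


def build_command(home_url, doc_file, doc_name, plucker_dir,
                  maxdepth=1, images=False):
    """Assemble a plucker-build argument list (hypothetical helper)."""
    argv = [
        "plucker-build",
        f"--pluckerdir={plucker_dir}",
        f"--doc-name={doc_name}",
        f"--doc-file={doc_file}",
        f"--home-url={home_url}",
        f"--maxdepth={maxdepth}",
    ]
    if not images:
        # --bpp=0 is documented as identical to --noimages
        argv.append("--bpp=0")
    return argv


argv = build_command(
    home_url="http://www.foo.com/cafe/weeklymenu.html",
    doc_file="Menu",
    doc_name="Cafeteria Menu",
    plucker_dir="/tmp",
)
# Print a shell-quoted form for inspection; nothing is executed here.
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) would execute it, assuming plucker-build is installed and on the PATH. Using a list rather than a single string avoids quoting problems with values such as "Cafeteria Menu".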
ENVIRONMENT VARIABLES
HOME
    Used to determine the location of the user's configuration file. If not set, the system-wide configuration file is used.

HTTP_PROXY, HTTP_PROXY_USER, HTTP_PROXY_PASS
    If set, will be used to retrieve URLs with the http URL scheme.

PLUCKERHOME
    Specifies value for PluckerHome. See the option --pluckerhome for more details.

PLUCKERDIR
    Specifies value for PluckerDir. See the option --pluckerdir for more details.
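
The way these variables interact with --pluckerhome and --pluckerdir can be summarized in a few lines. This is a sketch of the documented precedence, not Plucker's actual code: the function name resolve_plucker_dirs is hypothetical, configuration-file values are left out, and the rule that --pluckerdir beats PLUCKERDIR is an assumption made by analogy with --pluckerhome.

```python
def resolve_plucker_dirs(environ, pluckerhome_opt=None, pluckerdir_opt=None):
    """Return (PluckerHome, PluckerDir) for the given environment and options.

    environ is a mapping such as os.environ. The *_opt arguments stand for
    values given with --pluckerhome and --pluckerdir (None if not given).
    """
    home = environ.get("HOME", "")
    # Documented: an explicit --pluckerhome overrides PLUCKERHOME, which
    # overrides the default of $HOME/.plucker.
    plucker_home = (pluckerhome_opt
                    or environ.get("PLUCKERHOME")
                    or f"{home}/.plucker")
    # Documented: PluckerDir defaults to PluckerHome.
    # Assumed: --pluckerdir takes precedence over PLUCKERDIR.
    plucker_dir = (pluckerdir_opt
                   or environ.get("PLUCKERDIR")
                   or plucker_home)
    return plucker_home, plucker_dir
```

For example, with only HOME=/home/u set, both directories resolve to /home/u/.plucker; setting PLUCKERDIR=/tmp changes only the second.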
CONFIGURATION FILES
Two configuration files are examined for customized settings of the various plucker-build parameters. The first is a system-wide configuration file, by default /usr/local/etc/pluckerrc, or /etc/pluckerrc on Debian systems. Any settings in this file may be overridden with a personal configuration file, $HOME/.pluckerrc.

Both files contain any number of sections, each of which may contain any number of configuration parameter settings. Each section has a name, which is enclosed in square brackets, followed by parameter settings. Normally, only the section called "default" will be examined. Extra sections may be specified with the --extra-section option to plucker-build; settings in these sections will override values in the default section.

Parameter settings have the form name = value, where name is the name of a plucker-build parameter, and value is a string, integer, floating-point, or boolean value. A colon character (:) may be used instead of the equals sign to separate name and value. Comments may be expressed by starting any line with the characters "rem", or with the character "#", or with the character ";". Boolean values of True may be expressed with "TRUE", "true", "True", "on", or "1". Boolean values of False may be expressed with "FALSE", "false", "False", "off", or "0".

Sections are useful for frequently used groups of options: define the options in a section of the configuration file, then name that section with --extra-section when running plucker-build, and the options are drawn from that section.

The following parameters are understood:

PLUCKERHOME
    See option --pluckerhome.

alt_maxheight
    See option --alt-maxheight.

alt_maxwidth
    See option --alt-maxwidth.

anchor_color
    A color to draw all links in, expressed as one of the 16 standard Web color names, or in the Web standard RGB color notation. See the HTML 4.01 specification for more details on allowed color names and RGB notation.

author_md
    See option --author.

auto_scale_images
    A boolean; if true, plucker-build will automatically attempt to convert images which are too large to include in the document to a smaller form which will fit in the document. Defaults to false.

backup_bit
    See option --backup.

big_icon
    See option --icon.

bmp_to_tbmp
    Name of the bmp2tbmp program in Windows. Defaults to Bmp2Tbmp.exe.

bmp_to_tbmp_parameter
    Parameter for the bmp2tbmp program in the Windows ImageMagick image parser.

bpp
    See option --bpp.

cache_dir_name
    Specify the subdirectory of PluckerDir to use for cache storage. The default is "cache".

category
    See option --category.

color_paragraphs
    Boolean; if set, will insert a specific foreground color at beginning of every paragraph. Shouldn't be necessary, and defaults to off.

compression
    See option --compression.

convert_program
    If using the deprecated imagemagick image parser, the name of the convert program. Defaults to convert (convert.exe for Windows).

convert_program_parameter
    Parameter for the Windows ImageMagick image parser's use of convert.

copyprevention_bit
    See option --beamable.

db_file
    Deprecated alternative to doc_file. May disappear in any release.

db_name
    Deprecated alternative to doc_name. May disappear in any release.

default_charset
    See option --charset.

depth_first
    See option --depth-first.

djpeg_program
    Name of the djpeg program. Defaults to djpeg. Used by the netpbm2 image parser.

doc_file
    See option --doc-file.

doc_name
    See option --doc-name.

exclusion_lists
    See option --exclusion-list. If multiple files are specified here, they should be separated by the appropriate separator character for your operating system (a colon on Unix platforms, a semicolon on Windows platforms).

filename_extension
    Extension to use for the filename. Defaults to pdb. Another possibility is plkr.

giftopnm_program
    Name of program used to convert GIF image files to PNM image files. Used by the netpbm and netpbm2 image parsers. Defaults to giftopnm.

guess_tbmp_size
    Boolean, defaults to on. Used by the Windows image parser.

home_maxdepth
    See option --maxdepth.

home_staybelow
    See option --staybelow.

home_stayondomain
    See option --stayondomain.

home_stayonhost
    See option --stayonhost.

home_url
    See option --home-url.

home_url_pattern
    See option --url-pattern.

http_proxy
    String giving any HTTP proxy server to use. Sets the environment variable HTTP_PROXY to this value.

http_proxy_pass
    String giving a password for any HTTP proxy. Sets the environment variable HTTP_PROXY_PASS to this value.

http_proxy_user
    String giving a username for any HTTP proxy. Sets the environment variable HTTP_PROXY_USER to this value.

image_compression_limit
    Integer giving the minimum number of image bytes to compress. Defaults to 0. Images smaller than this will not be compressed.

image_parser
    String specifying which image parser to use. If not specified, a working default will be used. It's suggested that you not specify this configuration parameter unless you know what you are doing. Acceptable values are netpbm2, pil2, imagemagick2, netpbm (deprecated), pil (deprecated), imagemagick (deprecated), windowspil, windows (deprecated). This value is ignored in the Java version of plucker-build.

imagemagick_convert_command
    Identifies the ImageMagick convert program in the imagemagick2 image parser. Defaults to convert.

indent_paragraphs
    Boolean which when set will cause paragraphs to have leading indentation, but no extra leading space. Defaults to off.

launchable_bit
    See option --launchable.

max_tbmp_size
    Integer, maximum size for an image in the windows image parser.

maxheight
    See option --maxheight.

maxwidth
    See option --maxwidth.

no_dithering_in_java_image_quantization
    Boolean, used in the Java plucker-build image parser to turn off dithering when an image is being quantized to the fixed set of colors used in Palm grayscale or eight-bit colormaps. Defaults to false.

no_urlinfo
    See option --no-urlinfo.

owner_id_build
    See option --owner-id.

palm1bit_graymap_file
    String, used by the netpbm2 and netpbm image parsers to get the location of the Palm colormap file.

palm2bit_graymap_file
    String, used by the netpbm2 and netpbm image parsers to get the location of the Palm colormap file.

palm4bit_graymap_file
    String, used by the netpbm2 and netpbm image parsers to get the location of the Palm colormap file.

palm8bit_stdcolormap_file
    String, used by the netpbm2 and netpbm image parsers to get the location of the Palm colormap file.

palmtopnm_program
    String, used by the netpbm2 image parser, giving the location of the palmtopnm program. Defaults to palmtopnm.

pgmtopbm_program
    String, used by the netpbm2 image parser, giving the location of the pgmtopbm program. Defaults to pgmtopbm.

pluckerdir
    See option --pluckerdir.

pngtopnm_program
    String, used by the netpbm2 image parser, giving the location of the pngtopnm program. Defaults to pngtopnm.

pnmcut_program
    String, used by the netpbm2 image parser, giving the location of the pnmcut program. Defaults to pnmcut.

pnmdepth_program
    String, used by the netpbm2 image parser, giving the location of the pnmdepth program. Defaults to pnmdepth.

pnmfile_program
    String, used by the netpbm2 image parser, giving the location of the pnmfile program. Defaults to pnmfile.

pnmscale_program
    String, used by the netpbm2 image parser, giving the location of the pnmscale program. Defaults to pnmscale.

ppmquant_program
    String, used by the netpbm2 image parser, giving the location of the pnmquant program. Defaults to pnmquant.

ppmtoTbmp_program
    String, used by various image parsers, giving the location of either the ppmtoTbmp program (in various deprecated image parsers), or in netpbm2, the pnmtopalm program. In netpbm2, defaults to pnmtopalm.

ppmtopgm_program
    String, used by the netpbm2 image parser, giving the location of the ppmtopgm program. Defaults to ppmtopgm.

referrer
    See option --referrer.

retrieval_timeout
    Integer, used to attempt to set a timeout in seconds on all retrievals.
    Will not affect timeouts on the Java version of plucker-build.

small_icon
    Filename of file containing a Palm icon to use as the small icon for the document, if the launchable bit is set. Defaults to a built-in icon.

status_file
    See option --status-file.

status_line_length
    Integer, specifying, in characters, the length of status lines output by the distiller. Defaults to 60. If a line is too long, some of the characters in the center are elided.

tbmp_compression
    Boolean, used by the windows image parser to indicate whether or not to use Palm compression on images. Defaults to true.

tbmp_compression_type
    Apparently also boolean, used by the windows image parser to indicate whether or not to use Palm compression on images. Defaults to true. The difference between this parameter and tbmp_compression is not known.

title_md
    See option --title.

try_reduce_bpp
    Boolean, controls whether the image parser will attempt to scale a large picture to fit by reducing the number of bits-per-pixel of the image. Only valid for the netpbm2, imagemagick2, pil2, java, and windows image parsers. Defaults to off. try_reduce_bpp has precedence over try_reduce_dimension and auto_scale_images.

try_reduce_dimension
    Boolean, controls whether the image parser will attempt to scale a large picture to fit by reducing the size of the image. Only valid for the netpbm2, imagemagick2, pil2, java, and windows image parsers.

use_cache
    See option --update-cache. Misleadingly named.

user_agent
    See option --user-agent.

verbosity
    See option --verbosity.

zlib_compression
    Specifies that zlib compression should be used. Deprecated in favor of compression.
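
The file format described in this section (bracketed section names, name = value or name: value settings, comment lines starting with "rem", "#", or ";", and a fixed set of boolean spellings) is simple enough to sketch in a few lines. The code below is an illustration only and is not Plucker's own parser; the function names, the sample section name "menu", and the choice to lowercase section names are all assumptions.

```python
TRUE_WORDS = {"TRUE", "true", "True", "on", "1"}
FALSE_WORDS = {"FALSE", "false", "False", "off", "0"}


def parse_pluckerrc(text):
    """Parse pluckerrc-style text into {section_name: {parameter: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the three documented comment styles.
        if not line or line.startswith(("rem", "#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            # Assumption: section names are compared case-insensitively.
            current = sections.setdefault(line[1:-1].strip().lower(), {})
            continue
        if current is None:
            continue  # setting outside any section: ignored in this sketch
        # Split on whichever separator ("=" or ":") appears first, so that
        # values such as URLs may themselves contain a colon.
        positions = [p for p in (line.find("="), line.find(":")) if p != -1]
        if not positions:
            continue
        cut = min(positions)
        current[line[:cut].strip()] = line[cut + 1:].strip()
    return sections


def as_bool(value):
    """Convert one of the documented boolean spellings to a Python bool."""
    if value in TRUE_WORDS:
        return True
    if value in FALSE_WORDS:
        return False
    raise ValueError(f"not a recognized boolean: {value!r}")


def effective_settings(sections, extra_sections=()):
    """Start from [default], then let each extra section override it."""
    merged = dict(sections.get("default", {}))
    for name in extra_sections:
        merged.update(sections.get(name.lower(), {}))
    return merged


# A hypothetical personal configuration file.
SAMPLE = """\
# hypothetical personal settings
[default]
bpp = 1
home_maxdepth: 2
backup_bit = on

rem a section for one often-used document
[menu]
home_url = http://www.foo.com/cafe/weeklymenu.html
home_maxdepth = 1
bpp = 0
"""

settings = effective_settings(parse_pluckerrc(SAMPLE), ["menu"])
```

With a file like the sample, running plucker-build --extra-section=menu would be expected to pick up the [menu] values on top of [default], which is what effective_settings models here.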
SEE ALSO
The Plucker User's Guide, at http://plkr.org/docs/.
BUGS
Report bugs using the Debian BTS and the reportbug tool, or directly upstream to http://bugs.plkr.org/ or <plucker-bugs@rubberchicken.org>.
AUTHORS
Holger Duerer, <holly@starship.python.net>, and Bill Janssen, <bill@janssen.org>

Plucker 1.2 - http://plkr.org/                    PLUCKER-BUILD(1)