Provided by: podget_0.9.3-1_all

NAME

       Podget - Simple tool to automate downloading of podcasts.

SYNOPSIS

       podget <options>

DESCRIPTION

       Podget is a simple podcast aggregator/downloader optimized for scheduled background jobs (e.g. cron).

       It features support for:
       - Downloading podcasts from RSS and ATOM XML feeds.
       - Sorting the files into folders and categories.
       - Importing URLs from iTunes PCAST files and OPML lists.
       - Automatic M3U & ASX playlist creation.
       - Cleanup of old files.
       - Automatic UTF-16 conversion for feeds hosted on MS Windows Servers.

       Podget works by extracting the <enclosure> tags from the feed and then downloading the URL each one
       specifies.  The one exception is that Podget ignores <enclosure> tags that appear within
       <podcast:liveItem> tags: Podget is an aggregator, not a player, so it has not been optimized for
       live content.
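
       The extraction step described above can be illustrated with a minimal sketch.  It is not Podget's own
       code: the function name and the sample feed are hypothetical, each <enclosure> tag is assumed to sit
       on one line with a double-quoted url attribute, and the <podcast:liveItem> exception is not handled.

```shell
#!/usr/bin/env bash
# Sketch only: list the enclosure URLs found in a feed read from stdin.
# Assumes one <enclosure> tag per line with a double-quoted url attribute.
extract_enclosure_urls() {
    # Keep only the part of each tag up to and including url="...",
    # then strip everything except the quoted value.
    grep -oiE '<enclosure[^>]*url="[^"]+"' \
        | sed -e 's/.*="//' -e 's/"$//'
}

# Hypothetical sample feed fragment.
sample_feed='<item><title>Episode 1</title>
<enclosure url="http://example.com/ep1.mp3" length="123" type="audio/mpeg"/>
</item>'

printf '%s\n' "$sample_feed" | extract_enclosure_urls
```

       Each URL printed this way is what a downloader would then fetch.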

OPTIONS

       -c <FILE> | --config <FILE>
              Name of configuration file.

       --create-config <FILE>
              Create configuration file and exit.

       -C | --cleanup
              Skip downloading and only run cleanup loop.

       --cleanup_days <NUMBER>
              Cleanup files older than <NUMBER> days.

       --cleanup_simulate
              Simulate cleanup loop to see what files would be deleted.

       -d <DIRECTORY> | --dir_config <DIRECTORY>
              Directory that configuration files are stored in.

       --dir_session <DIRECTORY>
              Directory that session files are stored in.

       -f | --force
              Force download of items from each feed even if they've already been downloaded.

       -h | --help
              Display condensed help dialog.

       -l <DIRECTORY> | --library <DIRECTORY>
              Directory to store downloaded files in.

       -n | --no-playlist
              Do not create M3U playlist of new items.

       -p | --playlist-asx
              In addition to M3U playlists, create ASX playlists.

       --playlist-per-podcast
              Create a playlist of new items for each podcast feed.

       -r <COUNT> | --recent <COUNT>
              Download only the <COUNT> newest items from each feed.

       --serverlist <FILE>
              Use <FILE> as serverlist instead of default.

       -s | --silent
              Run silently (for cron jobs).

       -v     Set verbosity to level 1.

       -vv    Set verbosity to level 2.

       -vvv   Set verbosity to level 3.

       -vvvv  Set verbosity to level 4.

       --verbosity <LEVEL>
              Set verbosity level (0-4).

       -V | --version
              Display version.

       OPML List Options:

              --import_opml <FILE or URL>
                     Import servers from OPML file or HTTP/FTP URL.

              --export_opml <FILE>
                     Export serverlist to OPML file.

       PCAST List Options:

              --import_pcast <FILE or URL>
                     Import server from iTunes PCAST file or HTTP/FTP URL.

CONFIGURATION FILES

       By default, Podget relies on two configuration files.

       podgetrc
              This is a file with most options for how Podget should run.

              If it is required to run  podget  with  different  options  for  certain  feeds,  then  additional
              configuration  files  can be created and used with the --config or -c option.  When this option is
              run with a new filename that does not exist yet, the file is created with default options that can
              then be customized as necessary.

       serverlist
              This is a file of all the feeds that Podget should monitor and download from.

              If  you need to separate your feeds into multiple lists, then additional files can be created with
              the --serverlist option.  When this option is run with a new filename that does not exist yet, the
              file is created with a default list of a single feed.  Whenever a new list is created, Podget will
              download a single item from the single feed included by  default  to  verify  that  everything  is
              working.

              For  a  description  of  the  options  available  for  this  file, please refer to the SERVER LIST
              CONFIGURATION section of this document.

   USER CONFIGURATION DIRECTORY
       The first time a user runs podget, it will create a configuration directory.  In this directory, it  will
       install the default configuration files.

       Where  this  configuration directory is automatically placed is dependent upon the version of Podget that
       you used when you first ran it.

       For version 0.8.10 and before:
              $HOME/.podget

       For later versions:
              If $XDG_CONFIG_HOME is set then it will be placed in:  $XDG_CONFIG_HOME/podget
              If unset, then it will be placed in: $HOME/.config/podget

       If a user wants to clean up their $HOME directory by moving their  existing  configuration  directory  to
       either  of the new locations, it can be done but it is necessary to remember to remove the leading period
       so it is no longer a hidden directory.
              Example:  mv $HOME/.podget $HOME/.config/podget

       These locations can be overridden by the use of the --dir_config or -d option when you run podget.
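
       For users who want to script the move described above, the following is a minimal sketch.  The
       function name is hypothetical and it is not part of Podget; it assumes the old directory is
       $HOME/.podget and refuses to overwrite an existing target.

```shell
#!/usr/bin/env bash
# Sketch only: move an old-style configuration directory to the newer
# location, refusing to overwrite anything that already exists there.
migrate_config_dir() {
    local old="$HOME/.podget"
    # Use $XDG_CONFIG_HOME when set, otherwise fall back to $HOME/.config.
    local new="${XDG_CONFIG_HOME:-$HOME/.config}/podget"

    [ -d "$old" ] || { echo "nothing to move: $old not found" >&2; return 1; }
    [ -e "$new" ] && { echo "refusing to overwrite $new" >&2; return 1; }

    mkdir -p "$(dirname "$new")"
    mv "$old" "$new"
}

# The function is only defined here; call it explicitly when ready:
#   migrate_config_dir
```

       Note that the new directory name has no leading period, as explained above.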

   WHICH CONFIGURATION DIRECTORY IS USED
       Since there are at least three possible locations for the configuration directory, it is necessary to
       know which one podget will use.  To keep things simple, Podget tests the locations in the following
       order and uses the first one it finds:

         1.  $HOME/.podget
         2.  $XDG_CONFIG_HOME/podget
         3.  $HOME/.config/podget

       This location testing is skipped by the use of the --dir_config or -d option.
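
       The search order above can be sketched as a small shell function.  This is an illustration, not
       Podget's own code: the function name is hypothetical, and skipping the second location when
       $XDG_CONFIG_HOME is unset is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: print the first existing configuration directory,
# testing the candidates in the documented order.
find_config_dir() {
    local candidates=("$HOME/.podget")
    # Assumption: the XDG location is only considered when the variable is set.
    [ -n "${XDG_CONFIG_HOME:-}" ] && candidates+=("$XDG_CONFIG_HOME/podget")
    candidates+=("$HOME/.config/podget")

    local dir
    for dir in "${candidates[@]}"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

find_config_dir || echo "no configuration directory found" >&2
```

       Because $HOME/.podget is tested first, an old directory left in place takes precedence over the
       newer locations.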

   AUTOMATIC CLEANUP
       You can enable automatic cleanup with every run by configuring it in your podgetrc file. Simply  set  the
       following options:

         # Autocleanup.
         # 0 == disabled
         # 1 == delete any old content
         cleanup=1

         # Number of days to keep files.   Cleanup will remove anything
         # older than this.
         cleanup_days=7

       However,  some  people  prefer  to run cleanup as a separate cron session. To do that, set the options in
       podgetrc to:

         # Autocleanup.
         # 0 == disabled
         # 1 == delete any old content
         cleanup=0

         # Number of days to keep files.   Cleanup will remove anything
         # older than this.
         cleanup_days=7

       Then add something similar to this example to your crontab:

         # Once a week on Sunday at 04:07AM
         07 04 * * Sun /usr/bin/podget -C

   MULTIPLE CONCURRENT SESSIONS
       When it starts, Podget checks for sessions that are already running with the same core configuration
       file and exits if any are found.  This ensures that long-running sessions are not interrupted by
       new ones.

       If you have feeds that require distinct configurations, then you can enable them to run simultaneously by
       using  separate  configuration  files for each.  Then if you have sufficient bandwidth, you can call them
       all at the same time.

       Example Crontab configuration:

         00 02 * * * /usr/bin/podget -c podgetrc-group1
         00 02 * * * /usr/bin/podget -c podgetrc-group2

   SEQUENTIAL SESSIONS
       Sometimes you have feed lists that use the same configuration but that you wish to keep separate.
       There are two ways to handle this.

       First, run them separately from crontab with sufficient time in between so they don't interfere with each
       other.

         00 02 * * * /usr/bin/podget --serverlist RSS-Feeds
         00 03 * * * /usr/bin/podget --serverlist ATOM-Feeds

       The second option is to place them into a shell script  so  they  are  called  sequentially  and  do  not
       interfere with each other and then add it to your crontab.

         #!/usr/bin/env bash
         /usr/bin/podget --serverlist RSS-Feeds
         /usr/bin/podget --serverlist ATOM-Feeds

   ENABLING DEBUG OUTPUT
       Debug output can be enabled in two ways.

       The first way is by uncommenting the DEBUG option in your podgetrc and setting it to '1'.  However, this
       way does not enable DEBUG until podgetrc is finally read, after just over 1400 lines of the script
       have run.  This is sufficient for most issues.

       The second way is from the command-line and enables debug as early as possible.

       Simply execute podget like so:

         $ DEBUG=1 podget -vvvv

       You can enable other options as well if you need to, but for debugging purposes it is highly recommended
       that you enable as much verbosity as possible.

   SERVER LIST CONFIGURATION
       By default, Podget reads the list of servers to contact from a file named serverlist.  However, you can
       configure the name with the config_serverlist variable in your podgetrc file.

       Feeds are listed one per line in the serverlist file.

       Default format with category and name:
              <url> <category> <name>

       Alternate Formats:
       1. With a category but no name.
              <url> <category>
       2. With a name but no category (2 ways).
              <url> No_Category <name>
              <url> . <name>
       3. With neither a category nor a name.
              <url>

       1. URL Rules:
              A. Any spaces in the URL need to be converted to %20
       2. Category Rules:
              A. Must be one word without spaces.
              B. You may use underscores and dashes.
              C. You can insert date substitutions.
                     %YY%  ==  Year
                     %MM%  ==  Month
                     %DD%  ==  Day
              D. Category disabling:
                     - With a name, the category must either be a single period (.) or 'No_Category'.
                     - If the name is blank, the category can also be blank.
       3. Name Rules:
              A. If you are creating ASX playlists, make sure the feed name does not contain any spaces and
              that the filename is not blank.
              B. You can leave the feed name blank, and files will be saved in the category directory.
              C. Names with spaces are only compatible with filesystems that allow for spaces in filenames.  For
              example,  spaces  in  feed names are OK for feeds saved to Linux ext partitions but are not OK for
              those saved to Microsoft FAT partitions.
              D. Feed names can be disabled by leaving them blank.
       4. Disable the downloading of any feed by commenting it out with a leading #.

       Example:
        http://www.lugradio.org/episodes.rss Linux LUG Radio

       Example with date substitution in the category and a blank feed name:
        http://downloads.bbc.co.uk/rmhttp/downloadtrial/worldservice/summary/rss.xml News-%YY%-%MM%-%DD%

       Example of two ways to do a feed with authentication:
        http://somesite.com/feed.rss CATEGORY Feed Name USER:username PASS:password
        http://username:password@somesite.com/feed.rss CATEGORY Feed Name

              NOTE: The second method will fail if a colon (:) is  part  of  the  username  or  password.   Both
              methods will fail if a space is part of the username or password.
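
       To make the line format and rules above concrete, here is a minimal parsing sketch.  It is not
       Podget's parser: both function names are hypothetical, trailing options (such as USER: and PASS:)
       are not handled and would end up in the name, and the date values are supplied by the caller
       because the exact width Podget uses for each placeholder is not stated here.

```shell
#!/usr/bin/env bash
# Sketch only: split one serverlist line into URL, category and name.
parse_feed_line() {
    local line=$1 url category name
    # Skip blank lines and feeds disabled with a leading '#'.
    case $line in
        ''|'#'*) return 1 ;;
    esac
    # The first two words are URL and category; the rest is the name.
    read -r url category name <<<"$line"
    # A single period or 'No_Category' disables the category.
    case $category in
        .|No_Category) category='' ;;
    esac
    printf 'url=%s\ncategory=%s\nname=%s\n' "$url" "$category" "$name"
}

# Replace the date placeholders with caller-supplied values.
expand_category() {
    local category=$1 year=$2 month=$3 day=$4
    printf '%s\n' "$category" \
        | sed -e "s/%YY%/$year/g" -e "s/%MM%/$month/g" -e "s/%DD%/$day/g"
}

parse_feed_line 'http://www.lugradio.org/episodes.rss Linux LUG Radio'
expand_category 'News-%YY%-%MM%-%DD%' 2023 02 10
```

       Splitting on whitespace is also why spaces in a URL must be written as %20: an unencoded space
       would end the URL field early.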

       Common Options:

       OPT_CONTENT_DISPOSITION
              Attempt to get the filename from the Content-Disposition header in the output of 'wget
              --server-response'.

       OPT_DISPOSITION_FAIL
              This option works in conjunction with OPT_CONTENT_DISPOSITION by removing from the COMPLETED log
              any URLs that fail to receive a filename.  This allows them to be retried automatically the next
              time a session runs.  If this option is added to a feed that has already been downloaded, the
              user will need to remove the URLs for the problematic files from the COMPLETED log manually.  On
              one feed, this reduced the rate of filename problems from approximately 15% to under 2% over the
              course of 6 sessions.  Those sessions can occur sequentially on one day or as part of your
              established cron rotation.

       OPT_FEED_ORDER_ASCENDING
              By  default,  Podget assumes that items in a feed will be listed from newest to oldest (descending
              order).  This option will modify Podget's handling of the feed for  those  that  are  listed  from
              oldest  to  newest.   This  option will not have any noticeable effect for feeds where you want to
              download every item.  It will have an effect for new feeds when combined with the --recent <COUNT>
              option.

       OPT_FEED_PLAYLIST_NEWFIRST
              Most  playlist  options  create  lists  of  just  the new items that are downloaded in the current
              session.  This option creates or updates a full playlist for all items available for a feed sorted
              from newest to oldest based on the modification date/time of the file.

       OPT_FEED_PLAYLIST_OLDFIRST
              Same as OPT_FEED_PLAYLIST_NEWFIRST except playlist is ordered from oldest to newest.

       OPT_FILENAME_LOCATION
              Some  feeds  do  not  have  the detailed filename listed in the FEED but rather rename the file on
              redirection.  This option addresses that issue by attempting to grab the filename  from  the  last
              'Location:' tag in the output of 'wget --server-response'.

       OPT_FILENAME_RENAME_MDATE
              This option is for feeds that use the same filename for every item, with each item identified
              only by a long, somewhat incomprehensible string in the URL.  These feeds were previously fixed
              with FILENAME_FORMATFIX4, which appends that string to the common filename to produce a unique
              filename for each item.  However, the resulting filenames were not very easy to understand.
              This option provides another method for dealing with these common filenames: it prepends the
              date of the file's last change (modification date) to the filename in the format
              YYYYMMDD_HHhMMm_<common-part>.  This makes the filenames sortable and gives the user something
              that makes a moderate amount of sense.  It does not work for all feeds: for some, the last
              modification time of each file is the time of download.  That may be acceptable in some
              situations but can cause confusion when downloading more than one item at a time from a feed.

       OPT_WGET_DEFUSERAGENT
              Configure Wget to use its default user-agent (normally formatted similar to "Wget/1.21.2") and to
              not use either Podget's default user-agent ("Podget") or a custom agent set in WGET_BASEOPTS in
              podgetrc.

       OPT_NO_CERT_CHECK
              Disable wget SSL certificate verification.  This is commonly used for feeds that are using
              self-signed certificates.

       OPT_PREFER_IPv4 or OPT_PREFER_IPv6
              Configure wget so that, when a DNS lookup returns several addresses, it connects to addresses
              of the specified family first.

       Examples:
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_PREFER_IPv4
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_PREFER_IPv6
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_WGET_DEFUSERAGENT
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_NO_CERT_CHECK
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_CONTENT_DISPOSITION
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_CONTENT_DISPOSITION OPT_DISPOSITION_FAIL
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_LOCATION
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_RENAME_MDATE
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_LOCATION OPT_FILENAME_RENAME_MDATE
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FEED_ORDER_ASCENDING
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FEED_PLAYLIST_NEWFIRST
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FEED_PLAYLIST_OLDFIRST
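
       The filename format described for OPT_FILENAME_RENAME_MDATE can be illustrated with a minimal
       sketch.  It is not Podget's implementation: the function name and the demonstration file are
       hypothetical, and it relies on the GNU versions of date (-r FILE) and touch (-d).

```shell
#!/usr/bin/env bash
# Sketch only: build a name of the form YYYYMMDD_HHhMMm_<common-part>
# from a file's modification time (GNU date's -r FILE option).
mdate_name() {
    local file=$1 stamp
    stamp=$(date -r "$file" '+%Y%m%d_%Hh%Mm') || return 1
    printf '%s_%s\n' "$stamp" "$(basename "$file")"
}

# Hypothetical demonstration file with a known modification time.
demo_dir=$(mktemp -d)
touch -d '2023-02-10 04:15:00' "$demo_dir/episode.mp3"
mdate_name "$demo_dir/episode.mp3"
rm -rf "$demo_dir"
```

       Because the prefix runs from year down to minute, a plain alphabetical sort of the resulting
       names is also a chronological sort.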

       RSS Feed Options:
              There are three options for RSS Feeds that are not supported for ATOM feeds.

              The first two relate to renaming the downloaded files with the contents of the <TITLE> tag
              from the feed, and the third expands which tags Podget gets content from.

       OPT_FILENAME_RENAME_TITLETAG
              This  first  version  is for handling feeds that place the <TITLE> tag before the <ENCLOSURE> tag.
              The majority of tested feeds that use <TITLE> tags follow this order.

       OPT_FILENAME_RENAME_REVTITLETAG
              The second version is for handling feeds that have the  <ENCLOSURE>  tag  first  followed  by  the
              <TITLE> tag.

       OPT_RSS_MEDIACONTENT
              This  third option will enable Podget to download content from <MEDIA:CONTENT> tags in addition to
              <ENCLOSURE> tags.

       Examples:
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_RENAME_TITLETAG
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_RENAME_TITLETAG OPT_FILENAME_RENAME_MDATE
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_FILENAME_RENAME_REVTITLETAG
        http://somesite.com/feed.rss CATEGORY Feed Name OPT_RSS_MEDIACONTENT

       To determine if the feed uses <TITLE> tags and in which order, run the following with  the  URL  for  the
       feed:

               wget -O - http://somesite.com/feed.rss | sed -n -e :a -e 's/.*<enclosure.*url\s*=\s*"\([^"]\+\)".*/URL \1/Ip' -e t -e "s/.*<enclosure.*url\s*=\s*'\([^']\+\)'.*/URL \1/Ip" -e t -e 's/.*<title>\(.*\)<[/]title>.*$/TITLE \1/Ip' -e t -e '/\(<enclosure\|<title>\).*/I{N;s/ *\n */ /;T;ba}'

       This  will produce a list of lines that start with either TITLE or URL.  The  URL is from the <ENCLOSURE>
       tag and the TITLE is obviously from the <TITLE> tag.  On many feeds the first thing you will notice is  a
       few  uses of the <TITLE> tag before the first URL is specified.  In that case, Podget uses the last TITLE
       found, so the earlier ones are discarded.  The important part is when we get to the first URL, from there
       we  need to determine if the title for that item came before or after the URL.  If it comes first then we
       use   OPT_FILENAME_RENAME_TITLETAG   for   it.    If   the   title   comes    second    then    we    use
       OPT_FILENAME_RENAME_REVTITLETAG.

       On  some feeds, the downloaded filename will not have anything identifiable to determine which TITLE goes
       with it.  In those cases it may be necessary to download a few items and  listen  to  them  to  determine
       which order they use.

       On  some  feeds,  it  will be discovered that the downloaded filename and the TITLE are very similar.  In
       those cases, it is left to the user to determine which they prefer.

       On some feeds, the TITLE will have very little to specify when it was recorded and it may  be  useful  to
       use the OPT_FILENAME_RENAME_MDATE option to add a date tag to each filename as it is converted.

       And  on  some  feeds, there will be a complete absence of TITLE lines.  Those feeds do not use the tag so
       using either option will not produce any changes.

       Atom Feed Options:
              The following options are available for advanced handling of Atom feeds.

       ATOM_FILTER_SIMPLE
              This option will enable filtering for just audio or video files from a feed.

       ATOM_FILTER_TYPE="type"
              This option allows more detailed filtering of the variety of types available.  This can limit  the
              files   downloaded   to   one   type   (example:   "audio/mpeg")  or  to  a  few  types  (example:
              "(audio|video)/.*" for all audio and video types, OR "audio/.*" for all audio types).

       ATOM_FILTER_LANG="language"
              If an Atom feed supports multiple languages for enclosures, then you can use this option to filter
              to  only  those  you  desire.   You  can limit to one language (example: "en" for just English) or
              combine several supported languages to get them all (example: "(en|es|fr)" to download files in
              English, Spanish and French).  How the languages are defined may vary from feed to feed.

       Note:   If  you do not enable any of the ATOM_FILTER options on a feed with multiple enclosures per item,
       when you run podget it will tell you the count per type or language to help  you  decide  if  you  should
       enable the filters to reduce the number of files to be downloaded.

       Examples:
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_SIMPLE
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_TYPE="audio/mpeg"
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_TYPE="(audio|video)/.*"
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_LANG="en"
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_LANG="(en|es|fr)"
        http://somesite.com/feed CATEGORY Feed Name ATOM_FILTER_TYPE="audio/mpeg" ATOM_FILTER_LANG="en"
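
       To preview how a filter value behaves before adding it to a feed, a minimal sketch follows.  It
       assumes the value is an extended regular expression that must match the whole type string;
       Podget's own matching may differ, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: test whether an enclosure type matches a filter value.
# Assumption: the filter is an extended regex matched against the whole string.
type_matches() {
    local filter=$1 type=$2
    printf '%s\n' "$type" | grep -Eqx -- "$filter"
}

for t in audio/mpeg video/mp4 application/pdf; do
    if type_matches '(audio|video)/.*' "$t"; then
        echo "keep $t"
    else
        echo "skip $t"
    fi
done
```

       The same function can be tried with a language filter such as "(en|es|fr)" against values like
       "en" or "de".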

   HANDLING UTF-16 FEEDS
       Some servers provide their feeds in UTF-16 format rather than the more common UTF-8.

       To automatically convert these files, create a secondary serverlist in your configuration directory:

               serverlist.utf16

       Remember  to  change the name of the serverlist to match what you set it to with config_serverlist if you
       changed it.
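
       To check by hand whether a feed converts cleanly, the conversion itself can be sketched with
       iconv.  This is only an illustration: the function name is hypothetical and Podget's own
       conversion may differ.

```shell
#!/usr/bin/env bash
# Sketch only: convert a UTF-16 feed on stdin to UTF-8 on stdout.
# Uses iconv for illustration; Podget's own conversion may differ.
utf16_to_utf8() {
    iconv -f UTF-16 -t UTF-8
}

# Hypothetical round trip: encode a sample string as UTF-16, then decode it.
printf '<title>Example</title>\n' | iconv -f UTF-8 -t UTF-16 | utf16_to_utf8
```

       With a real feed, the input would come from the downloaded file instead of the sample string.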

EXAMPLE CRON JOB

       Once podget is running correctly, it's most useful if you run it from a cron job so that the new episodes
       are available to play or load onto a portable player and you don't have to wait for them to download.

       To edit your crontab, do:

         $ crontab -e

       Then add one line similar to this example:

         15 04 * * * /usr/bin/podget -s

       This will run podget at 4:15 AM every day.

       In  some  cases,  you  might  need to add a few directories to your PATH variable so that Podget can find
       everything it needs.

       Then the job might look like:

         15 04 * * * PATH=/opt/local/bin:/usr/local/bin:$PATH /usr/bin/podget -s

AUTHORS

       Dave Vehrs

                                                10 February 2023                                       podget(7)