Provided by: liburi-find-perl_20160806-3_all

NAME

       urifind - find URIs in a document and dump them to STDOUT.

SYNOPSIS

           $ urifind file

DESCRIPTION

       urifind is a simple script that finds URIs in one or more files (using "URI::Find"), and
       outputs them to STDOUT.  That's it.

       To find all the URIs in file1, use:

           $ urifind file1

       To find the URIs in multiple files, simply list them as arguments:

           $ urifind file1 file2 file3

       urifind will read from "STDIN" if no files are given or if a filename of "-" is specified:

           $ wget http://www.boston.com/ -O - | urifind

       When multiple files are listed, urifind prefixes each found URI with the file from which
       it came:

           $ urifind file1 file2
           file1: http://www.boston.com/index.html
           file2: http://use.perl.org/

       This can be turned on for single files with the "-p" ("prefix") switch:

           $ urifind -p file3
           file3: http://fsck.com/rt/

       It can also be turned off for multiple files with the "-n" ("no prefix") switch:

           $ urifind -n file1 file2
           http://www.boston.com/index.html
           http://use.perl.org/

       By default, URIs will be displayed in the order found; to sort them ascii-betically, use
       the "-s" ("sort") option.  To reverse sort them, use the "-r" ("reverse") flag ("-r"
       implies "-s").

           $ urifind -s file1 file2
           http://use.perl.org/
           http://www.boston.com/index.html
           mailto:webmaster@boston.com

           $ urifind -r file1 file2
           mailto:webmaster@boston.com
           http://www.boston.com/index.html
           http://use.perl.org/
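
       The ordering in the two listings above can be reproduced with standard tools.  The
       following is a minimal sketch, not urifind's own code: it assumes that "ascii-betical"
       order means byte-wise comparison (what sort(1) does under LC_ALL=C) and feeds it the
       three sample URIs from the examples.

```shell
# Sample URIs copied from the examples above, in an arbitrary "as found" order.
uris='mailto:webmaster@boston.com
http://www.boston.com/index.html
http://use.perl.org/'

# Byte-wise ascending order, assumed to correspond to "-s".
ascending=$(printf '%s\n' "$uris" | LC_ALL=C sort)

# Byte-wise descending order, assumed to correspond to "-r".
descending=$(printf '%s\n' "$uris" | LC_ALL=C sort -r)

printf '%s\n' "$ascending"
printf '%s\n' '--'
printf '%s\n' "$descending"
```

       In byte order "h" sorts before "m" and "u" before "w", which is why the mailto: URI
       comes last in the ascending listing and first in the reversed one.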

       Finally, urifind supports limiting the returned URIs by scheme or by arbitrary pattern,
       using the "-S" option (for schemes) and the "-P" option (for patterns).  Both "-S" and
       "-P" can be specified multiple times:

           $ urifind -S mailto file1
           mailto:webmaster@boston.com

           $ urifind -S mailto -S http file1
           mailto:webmaster@boston.com
           http://www.boston.com/index.html

       "-P" takes an arbitrary Perl regex.  It might need to be protected from the shell:

           $ urifind -P 's?html?' file1
           http://www.boston.com/index.html

           $ urifind -P '\.org\b' -S http file4
           http://www.gnu.org/software/wget/wget.html
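
       Why the quoting matters can be seen without urifind at all.  The sketch below is
       only an illustration: it creates a scratch directory containing a made-up file named
       s_html_ (a hypothetical name chosen because it matches the glob) and compares what
       the shell hands to a command with and without quotes.

```shell
# Hypothetical scratch directory holding one file whose name happens to
# match the glob s?html? (the name s_html_ is made up for this demo).
demo_dir=$(mktemp -d)
touch "$demo_dir/s_html_"

# Unquoted: the shell treats ? as a wildcard and substitutes the file name.
unquoted=$(cd "$demo_dir" && echo s?html?)

# Single-quoted: the pattern is passed through unchanged.
quoted=$(cd "$demo_dir" && echo 's?html?')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$demo_dir"
```

       Without the quotes, a command would receive the file name rather than the regex
       whenever such a file exists in the current directory, so single-quoting the
       argument to "-P" is the safer habit.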

       Add a "-d" to have urifind dump the regexes generated from "-S" and "-P" to "STDERR".
       "-D" does the same but exits immediately:

           $ urifind -P '\.org\b' -S http -D
           $scheme = '^(\bhttp\b):'
           @pats = ('^(\bhttp\b):', '\.org\b')

       To remove duplicates from the results, use the "-u" ("unique") switch.
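
       The effect is comparable to de-duplicating a list of lines.  The sketch below uses
       awk(1) to keep the first occurrence of each line; whether urifind itself preserves
       first-occurrence order without "-s" is an assumption here, not something this page
       states.

```shell
# A list with one repeated URI (values reused from the examples above).
uris='http://use.perl.org/
http://www.boston.com/index.html
http://use.perl.org/'

# Keep the first occurrence of each line and drop later repeats.
unique=$(printf '%s\n' "$uris" | awk '!seen[$0]++')

printf '%s\n' "$unique"
```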

OPTION SUMMARY

       -s  Sort results.

       -r  Reverse sort results (implies -s).

       -u  Return unique results only.

       -n  Don't include filename in output.

       -p  Include filename in output (off by default, but on if multiple files are given on
           the command line).

       -P $re
           Print only URIs matching regex '$re' (may be specified multiple times).

       -S $scheme
           Only this scheme (may be specified multiple times).

       -h  Help summary.

       -v  Display version and exit.

       -d  Dump compiled regexes for "-S" and "-P" to "STDERR".

       -D  Same as "-d", but exit after dumping.

AUTHOR

       darren chamberlain <darren@cpan.org>

       (C) 2003 darren chamberlain

       This library is free software; you may distribute it and/or modify it under the same terms
       as Perl itself.

SEE ALSO

       URI::Find