Ubuntu Manpage: Net::OpenSSH::Parallel - Run SSH jobs in parallel

Provided by: libnet-openssh-parallel-perl_0.14-1_all

NAME

       Net::OpenSSH::Parallel - Run SSH jobs in parallel

SYNOPSIS

         use Net::OpenSSH::Parallel;

         my $pssh = Net::OpenSSH::Parallel->new();
         $pssh->add_host($_) for @hosts;

         $pssh->push('*', scp_put => '/local/file/path', '/remote/file/path');
         $pssh->push('*', command => 'gurummm',
                     '/remote/file/path', '/tmp/output');
         $pssh->push($special_host, command => 'prumprum', '/tmp/output');
         $pssh->push('*', scp_get => '/tmp/output', 'logs/%HOST%/output');

         $pssh->run;

DESCRIPTION

       Run this here, that there, etc.

       "Net::OpenSSH::Parallel" is a scheduler that can run commands in parallel on a set of hosts through SSH.
       It tries to find a compromise between being simple to use, efficient and covering a good part of the
       problem space of parallel process execution via SSH.

       Obviously, it is built on top of Net::OpenSSH!

       Common usage of the module is as follows:

       •   create a "Net::OpenSSH::Parallel" object

       •   register the hosts where you want to run commands with the "add_host" method

       •   queue the actions you want to run (commands, file copy operations, etc.) using the "push" method.

       •   call the "run" method and let the parallel scheduler take care of everything!

   Labeling hosts
       Every host is identified by a unique label that is given when the host is registered into the parallel
       scheduler. Usually, the host name is also used as the label, but this is not required by the module.

       The rationale behind using labels is that a hostname does not necessarily identify unique "remote
       processors" (for instance, sometimes your logical "remote processors" may be user accounts distributed
       over a set of hosts: "foo1@bar1", "foo2@bar1", "foo3@bar2", ...; a set of hosts that are accessible
       behind a unique IP, listening on different ports; etc.)
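
       As a rough sketch of labels that differ from hostnames, the calls below register three logical
       "remote processors" living on two machines. The labels, hostnames, user names and port number are
       hypothetical; "user" and "port" are ordinary Net::OpenSSH constructor options that "add_host" passes
       along.

```perl
use strict;
use warnings;
use Net::OpenSSH::Parallel;

my $pssh = Net::OpenSSH::Parallel->new;

# Label first, hostname second. Two accounts on the same machine
# become two independently scheduled hosts.
$pssh->add_host('foo1-bar1', 'bar1.example.com', user => 'foo1');
$pssh->add_host('foo2-bar1', 'bar1.example.com', user => 'foo2');

# A machine reachable through a non-standard port on a shared address.
$pssh->add_host('gw-2201', 'gateway.example.com', port => 2201);

# Later calls refer to the hosts by label.
$pssh->push('foo*',    command => 'id');
$pssh->push('gw-2201', command => 'uptime');

# $pssh->run would then execute the queued commands.
```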

   Selecting hosts
       Several  of  the  methods  of  this  module  (well,  currently,  just "push") accept a selector string to
       determine which of the registered hosts should be affected by the operation.

       For instance, in...

         $pssh->push('*', command => 'ls')

       the first argument is the selector. The one used here, "*", selects all the registered hosts.

       Other possible selectors are:

         'bar*'                # selects everything beginning with 'bar'
         'foo1,foo3,foo6'      # selects the hosts of the given names
         'bar*,foo1,foo3,foo6' # both
         '*doz*'               # everything containing 'doz'

       Note: I am still considering what the selector mini-language should look like; do not hesitate to
       send your suggestions!
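
       To make the selector semantics above concrete, here is a small stand-alone sketch that matches a
       selector against a list of labels. It is an illustrative re-implementation written for this page, not
       the module's own code; the function name "select_labels" and the sample labels are made up.

```perl
use strict;
use warnings;

# Comma-separated patterns; '*' matches any run of characters and
# everything else is matched literally.
sub select_labels {
    my ($selector, @labels) = @_;
    my @regexes = map {
        my $re = join '.*', map { quotemeta } split /\*/, $_, -1;
        qr/^$re\z/;
    } split /\s*,\s*/, $selector;
    # Keep every label that matches at least one pattern.
    return grep { my $l = $_; grep { $l =~ $_ } @regexes } @labels;
}

my @labels = qw(bar1 bar2 foo1 foo3 doz9 xdozy);
print join(' ', select_labels('bar*,foo1', @labels)), "\n";  # prints: bar1 bar2 foo1
print join(' ', select_labels('*doz*', @labels)), "\n";      # prints: doz9 xdozy
```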

   Local resource usage
       When the number of hosts managed by the scheduler is too high, the local node can become overloaded.

       Roughly, every SSH connection requires two local "ssh" processes (one to run the SSH connection and
       another one to launch the remote command), which results in around 5MB of RAM usage per host.

       CPU usage varies greatly depending on the tasks carried out. The most expensive are  short  remote  tasks
       (because  of  the local process creation and destruction overhead) and tasks that transfer big amounts of
       data through SSH (because of the encryption going on).

       In practice, CPU usage does not matter too much (mostly because the OS is able to manage it, but
       also because there is not much we can do to reduce it) and usually RAM is what we should be more
       concerned about.

       The module accepts two parameters to limit resource usage:

       •   "workers"

           is the maximum number of remote commands that can be running concurrently.

       •   "connections"

           is the maximum number of SSH connections that can be active concurrently.

       In practice, limiting the maximum number of connections indirectly limits RAM usage and limiting the
       maximum number of workers indirectly limits CPU usage.

       The module requires the maximum number of connections to be equal to or greater than the maximum
       number of workers, and it is recommended that "maximum_connections >= 2 * maximum_workers" (otherwise the
       scheduler will not be able to reuse connections efficiently).

       You will have to experiment to find out which combinations give  the  best  results  in  your  particular
       scenarios.

       Also, for small sets of hosts you can just leave these parameters unlimited.
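
       As a starting point for that experimentation, the following sketch derives both limits from a local
       RAM budget, using the rough 5MB-per-connection figure and the recommended 2:1 ratio mentioned above.
       The helper name "limits_for_ram" and the 500MB budget are hypothetical.

```perl
use strict;
use warnings;

# Turn a RAM budget (in MB) into "workers" and "connections" limits.
sub limits_for_ram {
    my ($ram_budget_mb) = @_;
    my $connections = int($ram_budget_mb / 5);   # ~5MB per connection
    $connections = 2 if $connections < 2;        # keep at least one worker
    my $workers = int($connections / 2);         # connections >= 2 * workers
    return (workers => $workers, connections => $connections);
}

my %limits = limits_for_ram(500);
print "workers=$limits{workers} connections=$limits{connections}\n";
# prints: workers=50 connections=100
```

       The returned pairs use the option names accepted by the constructor, so they can be passed on
       directly, for instance as Net::OpenSSH::Parallel->new(limits_for_ram(500)).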

   Variable expansion
       This module activates Net::OpenSSH variable expansion by default. That way, it is possible to easily
       customize the actions executed on every host based on some of its properties.

       For instance:

         $pssh->push('*', scp_get => "/var/log/messages", "messages.%HOST%");

       copies the log files, appending the name of the remote host to each local file name.

       The variables "HOST", "USER", "PORT" and "LABEL" are predefined.
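
       The substitution itself can be pictured with the stand-alone sketch below. It only illustrates the
       "%NAME%" replacement idea; the real expansion is performed by Net::OpenSSH and may behave differently
       in corner cases (for instance, with undefined variables). The function name "expand_vars" and the
       sample values are assumptions made for this example.

```perl
use strict;
use warnings;

# Replace %NAME% with the value of a known variable and leave
# unknown names untouched.
sub expand_vars {
    my ($template, %vars) = @_;
    $template =~ s/%(\w+)%/exists $vars{$1} ? $vars{$1} : "%$1%"/ge;
    return $template;
}

my %vars = (HOST => 'bar1.example.com', USER  => 'foo1',
            PORT => 22,                 LABEL => 'foo1-bar1');

print expand_vars('logs/%HOST%/messages.%LABEL%', %vars), "\n";
# prints: logs/bar1.example.com/messages.foo1-bar1
```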

   Error handling
       When something goes wrong (for instance, some host is unreachable, some  connection  dies,  some  command
       fails, etc.) the module can handle the error in several predefined ways as follows:

       Error policies

       To set the error handling policy, the "new", "add_host" and "push" methods support an optional
       "on_error" argument that can take the following values (these constants are available from
       Net::OpenSSH::Parallel::Constants):

       OSSH_ON_ERROR_IGNORE
           Ignores the error and continues executing tasks in the host queue as if it had never happened.

       OSSH_ON_ERROR_ABORT
           Aborts  the processing on the corresponding host. The error will be propagated to other hosts joining
           it at any later point once the join is reached.

           In other words, this policy aborts the queued jobs for this host and any other that has a dependency
           on it.

       OSSH_ON_ERROR_DONE
           Similar to "OSSH_ON_ERROR_ABORT" but will not propagate errors to other hosts via joins.

       OSSH_ON_ERROR_ABORT_ALL
           Causes  all the host queues to be aborted as soon as possible (and that usually means after currently
           running actions end).

       OSSH_ON_ERROR_REPEAT
           The module will try to perform the current task again and again until it succeeds. This policy can
           lead to an infinite loop, so its direct usage is discouraged (but see the following point about
           setting the policy dynamically).

       The default policy is "OSSH_ON_ERROR_ABORT".

       Setting the policy dynamically

       When a subroutine reference is used as the policy instead of any of the constants previously
       described, the given subroutine will be called on error conditions as follows:

         $on_error->($pssh, $label, $error, $task)

       $pssh is a reference to the "Net::OpenSSH::Parallel" object and $label is the label associated with the
       host where the error happened. $error is the error type as defined in Net::OpenSSH::Parallel::Constants
       and $task is a reference to the task that was being carried out.

       The  return  value  of the subroutine must be one of the described constants and the corresponding policy
       will be applied.
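
       A minimal sketch of such a callback follows. It ignores failures on a hypothetical set of optional
       hosts and aborts the queue for any other host. The labels are invented, and the ":on_error" export
       tag is assumed to be the way the policy constants are imported; check the
       Net::OpenSSH::Parallel::Constants documentation of your installed version.

```perl
use strict;
use warnings;
use Net::OpenSSH::Parallel;
use Net::OpenSSH::Parallel::Constants qw(:on_error);  # assumed export tag

# Hypothetical labels whose failures should not stop their queue.
my %optional = map { $_ => 1 } qw(cache1 cache2);

sub choose_policy {
    my ($pssh, $label, $error, $task) = @_;
    warn "task failed on $label (error code $error)\n";
    # The return value must be one of the policy constants.
    return $optional{$label} ? OSSH_ON_ERROR_IGNORE : OSSH_ON_ERROR_ABORT;
}

my $pssh = Net::OpenSSH::Parallel->new(on_error => \&choose_policy);
```

       Because "on_error" is also accepted by "add_host" and "push", the same subroutine could be attached to
       a single host or a single queued action instead.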

       Retrying connection errors

       If the module fails when trying to establish a new SSH connection or when  an  existing  connection  dies
       unexpectedly, the option "reconnections" can be used to instruct the module to retry the connection until
       it succeeds or the given maximum is reached.

       "reconnections" is accepted by both the "new" and "add_host" methods.

       Example:

         $pssh->add_host('foo', reconnections => 3);

       Note that the reconnections maximum is not per host but per queued task.

   API
       These are the available methods:

       $pssh = Net::OpenSSH::Parallel->new(%opts)
           creates a new object.

           The accepted options are:

           workers => $maximum_workers
               sets  the  maximum  number of operations that can be carried out in parallel (see "Local resource
               usage").

           connections => $maximum_connections
               sets the maximum number of SSH connections that can be  established  simultaneously  (see  "Local
               resource usage").

               $maximum_connections must be equal to or greater than $maximum_workers.

           reconnections => $maximum_reconnections
               when  connecting  to  some  host  fails,  this  argument  tells  the module the maximum number of
               additional connection attempts that it should perform before giving  up.  The  default  value  is
               zero.

               See also "Retrying connection errors".

           on_error => $policy
               Sets the error handling policy (see "Error handling").

       $pssh->add_host($label, %opts)
       $pssh->add_host($label, $host, %opts)
           registers a new host into the $pssh object.

           $label is the name used to refer to the registered host afterwards.

           When the hostname argument is omitted, the label is used also as the hostname.

           The accepted options are:

           on_error => $policy
               Sets the error handling policy (see "Error handling").

           reconnections => $maximum_reconnections
               See "Retrying connection errors".

           Any additional option will be passed verbatim to the Net::OpenSSH constructor later. For instance:

             $pssh->add_host($host, user => $user, password => $password);

       $pssh->push($selector, $action, \%opts, @action_args)
       $pssh->push($selector, $action, @action_args)
           pushes a new action into the queues selected by $selector.

           The supported actions are:

           command => @cmd
               queue the given shell command on the selected hosts.

               Example:

                 $pssh->push('*', 'command',
                             { stdout_fh => $find_fh, stderr_to_stdout => 1 },
                             'find', '/my/dir');

           scp_get => @remote, $local
           scp_put => @local, $remote
               These actions queue an "scp" remote file copy operation on the selected hosts.

           rsync_get => @remote, $local
           rsync_put => @local, $remote
               These actions queue an "rsync" remote file copy operation on the selected hosts.

           sub => sub { ... }, @extra_args
           sub { ... }, @extra_args
               Queues a call to a perl subroutine that will be executed locally.

               Note  that  subroutines  are executed synchronously in the same process, so no other task will be
               scheduled while they are running.

               The sub is called as

                 $sub->($pssh, $label, @extra_args)

               where $pssh is the current Net::OpenSSH::Parallel object.

           parsub => sub { ... }, @extra_args
               Queues a call to a perl subroutine that will be executed locally on a forked process.

               The sub is called as

                 $sub->($label, $ssh, @extra_args)

               where $ssh is a Net::OpenSSH object that can be used to interact with the remote machine.

               Note that the interface is different to that of the "sub" action.

               An example of usage:

                 sub sudo_install {
                     my ($label, $ssh, @pkgs) = @_;
                     my ($pty) = $ssh->open2pty('sudo', 'apt-get', 'install', @pkgs);
                     my $expect = Expect->init($pty);
                     $expect->raw_pty(1);
                     $expect->expect($timeout, ":");
                     $expect->send("$passwd\n");
                     $expect->expect($timeout, "\n");
                     $expect->raw_pty(0);
                     while(<$expect>) { print };
                     close $expect;
                 }

                 $pssh->push('*', parsub => \&sudo_install, 'scummvm');

               If the subroutine dies or calls "_exit" with a non-zero return code, the error handling code will
               be triggered (see "Error handling").

               The "parsub" action accepts the additional option "no_ssh" indicating that the $ssh object is not
               going to be used. For instance:

                 $pssh->push('*', parsub => { no_ssh => 1 },
                             sub {
                                   my $label = shift;
                                   { exec "gzip", "/tmp/file-$label" };
                                   die "exec failed: $!";
                             });

               That can make the script faster when the maximum number of simultaneous connections  is  limited.
               See "Local resource usage".

           join => $selector
               Joins allow jobs to be synchronized between different servers.

               For instance:

                 $pssh->push('server_B', scp_get => '/tmp/foo', 'foo');
                 $pssh->push('server_A', join => 'server_B');
                 $pssh->push('server_A', scp_put => 'foo', '/tmp/foo');

               The join makes server_A wait for the "scp_get" operation queued in server_B to finish before
               proceeding with the "scp_put".

               In general the join will make the selected servers wait  for  any  task  queued  on  the  servers
               matched by $selector to finish before proceeding with the next queued tasks.

               One common usage is to synchronize all servers at some point:

                 $pssh->push('*', join => '*');

               By  default, errors are propagated at joins. For instance, in the example above, if the "scp_get"
               operation queued on server_B failed, it would abort any further operation queued on server_B  and
               any further operation queued after the join in server_A. See also "Error handling".

           here => $tag
               Push a tag in the stack that can be used as a target for goto operations.

           goto => $target
               Jumps forward until the given "here" tag is reached.

               Joins to other hosts' queues will be ignored, and joins from other queues to this one will be
               successfully fulfilled. For instance:

                 $pssh->add_host(A => ...);
                 $pssh->add_host(B => ...);
                 $pssh->push('*', cmd  => 'echo "hello from %HOST%"');
                 $pssh->push('A', goto => 'there');
                 $pssh->push('A', join => 'B');                     # ignored by A on goto
                 $pssh->push('B', join => 'A');                     # fulfilled by A on goto
                 $pssh->push('*', cmd  => 'echo "hello from %HOST% again"');
                 $pssh->push('*', here => 'there');
                 $pssh->push('*', cmd  => 'echo "bye bye from %HOST%"');

               Note that it is not possible to jump backwards.

               There is a special target "END" that can be used to jump to the end of the queue.

           stop
               Discards any additional operations queued. Any pending joins will be successfully fulfilled.

               It is equivalent to

                 $pssh->push('*', goto => 'END');

           connect
               Just ensures that connecting to the remote machine is possible without doing any other action.

           When given, %opts can contain the following options:

           on_error => $fail_mode
           on_error => sub { ... }
               See "Error handling".

           or_goto => $tag
               Supported for "command", "scp_get", "scp_put", "rsync_get" and "rsync_put". When the command,
               "scp" or "rsync" operation fails, a "goto" to the given target is performed.

               For instance:

                 $pssh->all(command => { or_goto => 'no_file' },
                                       "test -f /etc/foo");
                 $pssh->all(scp_get => "/etc/foo", "/tmp/foo-%LABEL%");
                 $pssh->all(here    => "no_file");

               Failures related to SSH errors do not trigger the goto but the error handling code.

           timeout => $seconds
               not implemented yet!

           on_done => sub { ... }
               not implemented yet!

           Any other option will be passed to the corresponding Net::OpenSSH method (spawn, scp_put, etc.).

       $pssh->all($action => @args)
       $pssh->all($action => \%opts, @args)
           Shortcut for...

             $pssh->push('*', $action, \%opts, @args);

       $pssh->run
           Runs the queued operations.

           It returns a true value on success and false otherwise.

       $pssh->get_error($label)
           Returns the last error associated with the host of the given label.

       $pssh->get_errors
           In list context returns a list of pairs "$label => $error" for the failed queues.

           In scalar context returns the number of failed queues.
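
           The following sketch puts "run" and "get_errors" together to report failures after a run. The
           host names are hypothetical and the queued command is only a placeholder.

```perl
use strict;
use warnings;
use Net::OpenSSH::Parallel;

my @hosts = qw(host1.example.com host2.example.com);  # hypothetical hosts

my $pssh = Net::OpenSSH::Parallel->new;
$pssh->add_host($_) for @hosts;
$pssh->all(command => 'uptime');

unless ($pssh->run) {
    # List context: pairs of label => error for the failed queues.
    my %errors = $pssh->get_errors;
    for my $label (sort keys %errors) {
        warn "$label failed: $errors{$label}\n";
    }
    # Scalar context: number of failed queues.
    my $failed = $pssh->get_errors;
    die "$failed host queue(s) failed\n";
}
```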

FAQ - Frequently Asked Questions

       Running remote commands with sudo
           Q: I need to run the remote commands with sudo that asks for a password. How can I do it?

           A: First read the answer given to a similar question in the Net::OpenSSH FAQ.

           The problem is that Net::OpenSSH::Parallel methods do not support the "stdin_data" option, so you
           will have to use an external file.

             $pssh->push('*', cmd => { stdin_file => $passwd_file },
                              'sudo', '-Skp', '', '--', @cmd);

           One trick you can use if you only have one password is to use the "DATA" file handle:

             $pssh->push('*', cmd => { stdin_fh => \*DATA},
                         'sudo', '-Skp', '', '--', @cmd);
             ...
             # and at the end of your script
             __DATA__
             this-is-my-remote-password-for-sudo

           Or you can also use the "parsub" action:

             my %sudo_passwords = (host1 => "foo", ...);

             sub sudo {
               my ($label, $ssh, @cmd) = @_;
               $ssh->system({stdin_data => "$sudo_passwords{$label}\n"},
                            'sudo', '-Skp', '', '--', @cmd);
             }

             $pssh->push('*', parsub => \&sudo, @cmd);

TODO

       •   run N processes per host concurrently

           allow running more than one process per remote server concurrently

       •   delay before reconnect

           when connecting fails, do not try to reconnect immediately but after some predefined period

       •   rationalize debugging

           currently it is a mess

       •   add logging support

           log the operations performed in a given file

       •   stdio redirection

           add support for better handling of the Net::OpenSSH stdio redirection facilities

       •   configurable valid return codes

           A non-zero exit code is not always an error.

BUGS AND SUPPORT

       This module should be considered beta quality: everything seems to work, but it may still contain
       critical bugs.

       If you find any, please report it via <http://rt.cpan.org> or by email (to sfandino@yahoo.com).

       Feedback and comments are also welcome!

       The "sub" and "parsub" features should be considered experimental; their API or behavior could
       change in future versions of the module.

   Reporting bugs
       In order to report a bug, write a minimal program that triggers it and place the following line at the
       beginning:

         $Net::OpenSSH::Parallel::debug = -1;

       Then, send me (via RT or email) the debugging output you get when you run it. Also include the source
       code of the script, a description of what is going wrong and the details of your OS and the versions of
       Perl, "Net::OpenSSH" and "Net::OpenSSH::Parallel" you are using.

   Development version
       The source code for this module is hosted at GitHub: <http://github.com/salva/p5-Net-OpenSSH-Parallel>.

   Commercial support
       Commercial support, professional services and custom software development around this module are
       available through my current company. Drop me an email with a rough description of your requirements
       and we will get back to you ASAP.

   My wishlist
       If you like this module and you are feeling generous, take a look at my Amazon Wish List:
       <http://amzn.com/w/1WU1P6IR5QZ42>

       Also consider contributing to the OpenSSH project this module builds upon:
       <http://www.openssh.org/donations.html>.

COPYRIGHT AND LICENSE

       Copyright © 2009-2012, 2015 by Salvador Fandiño (sfandino@yahoo.com).

       This  library  is  free  software;  you can redistribute it and/or modify it under the same terms as Perl
       itself, either Perl version 5.10.0 or, at your  option,  any  later  version  of  Perl  5  you  may  have
       available.

perl v5.40.1                                       2025-09-06                        Net::OpenSSH::Parallel(3pm)