Provided by: python-mpire-doc_2.10.2-3_all

NAME

       mpire - mpire Documentation

       MPIRE, short for MultiProcessing Is Really Easy, is a Python package for multiprocessing. MPIRE is faster
       in  most  scenarios,  packs  more  features,  and  is  generally  more  user-friendly  than  the  default
       multiprocessing package. It combines the convenient map-like functions of multiprocessing.Pool with the
       benefits  of  using  copy-on-write  shared  objects of multiprocessing.Process, together with easy-to-use
       worker state, worker insights, worker init and exit functions, timeouts, and progress bar functionality.

FEATURES

       • Faster execution than other multiprocessing libraries. See benchmarks.

       • Intuitive, Pythonic syntax

       • Multiprocessing with map/map_unordered/imap/imap_unordered/apply/apply_async functions

       • Easy use of copy-on-write shared objects with a pool of workers (copy-on-write is  only  available  for
         start method fork, so it’s not supported on Windows)

       • Each worker can have its own state, and with the convenient worker init and exit functionality this
         state can be easily managed (e.g., to load a memory-intensive model only once per worker without the
         need to send it through a queue)

       • Progress bar support using tqdm (rich and notebook widgets are supported)

       • Progress dashboard support

       • Worker insights to provide insight into your multiprocessing efficiency

       • Graceful and user-friendly exception handling

       • Timeouts, including for worker init and exit functions

       • Automatic task chunking for all available map functions to speed up processing  of  small  task  queues
         (including numpy arrays)

       • Adjustable maximum number of active tasks to avoid memory problems

       • Automatic restarting of workers after a specified number of tasks to reduce memory footprint

       • Nested pools of workers are allowed when setting the daemon option

       • Child processes can be pinned to specific CPUs or a range of CPUs

       • Optionally utilizes dill as serialization backend through multiprocess, enabling the parallelization of
         more exotic objects, lambdas, and functions defined in iPython and Jupyter notebooks.

       MPIRE has been tested on Linux, macOS, and Windows. There are a few minor known caveats for  Windows  and
       macOS users, which can be found at Windows.

CONTENTS

   Installation
        MPIRE builds are distributed through PyPI.

       MPIRE can be installed through pip:

          pip install mpire

       and is available through conda-forge:

          conda install -c conda-forge mpire

   Dependencies
       • Python >= 3.8

       Python packages (installed automatically when installing MPIRE):

       • tqdm

       • pygments

       • pywin32 (Windows only)

       • importlib_resources (Python < 3.9 only)

       NOTE:
          When  using MPIRE on Windows with conda, you might need to install pywin32 using conda install pywin32
          when encountering a DLL failed to load error.

   Dill
       For some functions or tasks it can  be  useful  to  not  rely  on  pickle,  but  on  some  more  powerful
       serialization  backend,  like  dill. dill isn’t installed by default as it has a BSD license, while MPIRE
       has an MIT license. If you want to use it, the license of MPIRE will change to a BSD license as well,  as
       required by the original BSD license. See the BSD license of multiprocess for more information.

       You can enable dill by executing:

          pip install mpire[dill]

       This will install multiprocess, which uses dill under the hood. You can enable the use of dill by setting
       use_dill=True in the mpire.WorkerPool constructor.
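
        For example, a minimal sketch of enabling dill, assuming the dill extra has been installed as shown
        above:

           from mpire import WorkerPool

           # Requires `pip install mpire[dill]`
           with WorkerPool(n_jobs=4, use_dill=True) as pool:
               # dill can serialize lambdas, which regular pickle cannot
               results = pool.map(lambda x: x * x, range(10))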

   Rich progress bars
       If you want to use rich progress bars, you have to install the dependencies for it manually:

          pip install rich

   Dashboard
        Optionally, you can install the dependencies for the MPIRE dashboard, which depends on Flask. As with
        dill, Flask has a BSD license. Installing these dependencies will change the license of MPIRE to BSD as
        well. See the BSD license of Flask for more information.

        The dashboard allows you to see progress information from a browser. This is convenient when running
        scripts in a notebook or screen session, or when you want to share the progress information with others.
        Install the appropriate dependencies to enable this:

          pip install mpire[dashboard]

   Getting started
        Suppose you have a time-consuming function that receives some input and returns its results. This could
       look like the following:

          import time

          def time_consuming_function(x):
              time.sleep(1)  # Simulate that this function takes long to complete
              return ...

          results = [time_consuming_function(x) for x in range(10)]

        Running this code takes about 10 seconds to complete.

        Functions like these are known as embarrassingly parallel problems: they require little to no effort to
        turn into a parallel task. Parallelizing a simple function like this can be as easy as importing
        multiprocessing and using the multiprocessing.Pool class:

          from multiprocessing import Pool

          with Pool(processes=5) as pool:
              results = pool.map(time_consuming_function, range(10))

        We configured the pool to have 5 workers, so we can handle 5 tasks in parallel. As a result, this code will
       complete in about 2 seconds.

       MPIRE  can  be used almost as a drop-in replacement to multiprocessing. We use the mpire.WorkerPool class
       and call one of the available map functions:

          from mpire import WorkerPool

          with WorkerPool(n_jobs=5) as pool:
              results = pool.map(time_consuming_function, range(10))

        Similarly, this will complete in about 2 seconds. The differences in code are small: there’s no need to
        learn a completely new multiprocessing syntax if you’re used to vanilla multiprocessing. The additional
       available functionality, though, is what sets MPIRE apart.

   Progress bar
       Suppose we want to know the status of the current task: how many tasks are completed, how long before the
       work is ready?  It’s as simple as setting the progress_bar parameter to True:

          with WorkerPool(n_jobs=5) as pool:
              results = pool.map(time_consuming_function, range(10), progress_bar=True)

       And it will output a nicely formatted tqdm progress bar.

       MPIRE also offers a dashboard, for which you need to install additional dependencies. See  Dashboard  for
       more information.

   Shared objects
       If  you  have  one  or  more  objects  that you want to share between all workers you can make use of the
        copy-on-write shared_objects option of MPIRE. MPIRE will pass on these objects only once for each worker
        without copying/serialization. Only when the object is altered in the worker function will it be copied
        for that worker.

       NOTE:
          Copy-on-write is not available on Windows, as it requires the start method fork.

          def time_consuming_function(some_object, x):
              time.sleep(1)  # Simulate that this function takes long to complete
              return ...

          def main():
              some_object = ...
              with WorkerPool(n_jobs=5, shared_objects=some_object, start_method='fork') as pool:
                  results = pool.map(time_consuming_function, range(10), progress_bar=True)

       See Shared objects for more details.

   Worker initialization
       Need to initialize each worker before starting the work? Have a look at the worker_state and  worker_init
       functionality:

          def init(worker_state):
              # Load a big dataset or model and store it in a worker specific worker_state
              worker_state['dataset'] = ...
              worker_state['model'] = ...

          def task(worker_state, idx):
              # Let the model predict a specific instance of the dataset
              return worker_state['model'].predict(worker_state['dataset'][idx])

          with WorkerPool(n_jobs=5, use_worker_state=True) as pool:
              results = pool.map(task, range(10), worker_init=init)

       Similarly,  you  can  use  the  worker_exit  parameter  to  let  MPIRE  call a function whenever a worker
       terminates. You can even let this exit function return results, which can be obtained later on.  See  the
       Worker init and exit section for more information.

   Worker insights
        When your multiprocessing setup isn’t performing as you want it to and you have no clue what’s causing
        it, there’s the worker insights functionality. This will give you some insight into your setup, but it will
       not profile the function you’re running (there are other libraries for that). Instead,  it  profiles  the
       worker  start up time, waiting time and working time. When worker init and exit functions are provided it
       will time those as well.

       Perhaps you’re sending a lot of data over the task queue, which makes the waiting time  go  up.  Whatever
       the   case,   you   can   enable   and   grab   the   insights   using   the   enable_insights  flag  and
       mpire.WorkerPool.get_insights() function, respectively:

          with WorkerPool(n_jobs=5, enable_insights=True) as pool:
              results = pool.map(time_consuming_function, range(10))
              insights = pool.get_insights()

       See Worker insights for a more detailed example and expected output.

   Usage
   WorkerPool
       This section describes how to setup a mpire.WorkerPool instance.

   Starting a WorkerPool

       The mpire.WorkerPool class controls a pool of worker processes similarly to  a  multiprocessing.Pool.  It
        contains all the map-like functions (with the addition of mpire.WorkerPool.map_unordered()), together
       with the apply and apply_async functions (see Apply family).
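
        For example, a minimal sketch of using apply_async, assuming the returned object exposes a get() method
        similar to multiprocessing's AsyncResult:

           from mpire import WorkerPool

           def power(base, exponent):
               return base ** exponent

           with WorkerPool(n_jobs=4) as pool:
               # Schedule tasks one by one; each call returns an async result object
               async_results = [pool.apply_async(power, args=(i, 2)) for i in range(10)]

               # Collect the results in the order they were scheduled
               results = [async_result.get() for async_result in async_results]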

       An mpire.WorkerPool can be started in two different ways. The first and recommended way to do so is using
       a context manager:

          from mpire import WorkerPool

          # Start a pool of 4 workers
          with WorkerPool(n_jobs=4) as pool:
              # Do some processing here
              pass

       The with statement takes care of properly joining/terminating the  spawned  worker  processes  after  the
       block has ended.

       The other way is to do it manually:

          # Start a pool of 4 workers
          pool = WorkerPool(n_jobs=4)

          # Do some processing here
          pass

          # Only needed when keep_alive=True:
          # Clean up pool (this will block until all processing has completed)
          pool.stop_and_join()  # or use pool.join() which is an alias of stop_and_join()

          # In the case you want to kill the processes, even though they are still busy
          pool.terminate()

       When  using  n_jobs=None  MPIRE will spawn as many processes as there are CPUs on your system. Specifying
       more jobs than you have CPUs is, of course, possible as well.

       WARNING:
          In the manual approach, the results queue should be drained before joining the workers, otherwise  you
           can get a deadlock. If you want to join either way, use mpire.WorkerPool.terminate(). For more
           information, see the warnings in the Python multiprocessing documentation.
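
        For example, a minimal sketch of the manual approach where the lazy imap results are drained before
        joining (task being any function, as in the examples below):

           pool = WorkerPool(n_jobs=4)

           # Drain all results first ...
           results = list(pool.imap(task, range(100)))

           # ... and only then join the workers
           pool.stop_and_join()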

   Nested WorkerPools
        By default, the mpire.WorkerPool class spawns daemon child processes, which are not able to create child
       processes  themselves,  so  nested  pools  are  not allowed. There’s an option to create non-daemon child
       processes to allow for nested structures:

           def job(...):
              with WorkerPool(n_jobs=4) as p:
                  # Do some work
                  results = p.map(...)

          with WorkerPool(n_jobs=4, daemon=True, start_method='spawn') as pool:
              # This will raise an AssertionError telling you daemon processes
              # can't start child processes
              pool.map(job, ...)

          with WorkerPool(n_jobs=4, daemon=False, start_method='spawn') as pool:
              # This will work just fine
              pool.map(job, ...)

       NOTE:
          Nested pools aren’t supported when using threading.

       WARNING:
          Spawning processes is not thread-safe! Both start and join methods of the process class  alter  global
          variables. If you still want to have nested pools, the safest bet is to use spawn as start method.

       NOTE:
          Due  to a strange bug in Python, using forkserver as start method in a nested pool is not allowed when
          the outer pool is using fork, as the forkserver will not have been started there. For it to work  your
          outer pool will have to have either spawn or forkserver as start method.

       WARNING:
          Nested  pools  aren’t production ready. Error handling and keyboard interrupts when using nested pools
           can, on some rare occasions (~1% of the time), still cause deadlocks. Use at your own risk.

          When a function is guaranteed to finish successfully, using nested pools is absolutely fine.

   Process start method

       The multiprocessing package allows you to start processes using a few different methods: 'fork',  'spawn'
       or  'forkserver'.  Threading  is  also  available  by  using 'threading'. For detailed information on the
       multiprocessing contexts, please refer to the  multiprocessing  documentation  and  caveats  section.  In
       short:

       fork   Copies  the  parent  process  such  that the child process is effectively identical. This includes
              copying everything currently in memory. This is sometimes useful, but other times useless or  even
              a serious bottleneck. fork enables the use of copy-on-write shared objects (see Shared objects).

        spawn  Starts a fresh Python interpreter where only the necessary resources are inherited.

       forkserver
              First starts a server process (using 'spawn'). Whenever a new process is needed the parent process
              requests the server to fork a new process.

       threading
               Starts child threads. Suffers from the Global Interpreter Lock (GIL), but works fine for
               I/O-intensive tasks.

       For an overview of start method availability and defaults, please refer to the following table:
                             ┌──────────────┬───────────────────┬──────────────────────┐
                             │ Start method │ Available on Unix │ Available on Windows │
                             ├──────────────┼───────────────────┼──────────────────────┤
                             │ fork         │ Yes (default)     │ No                   │
                             ├──────────────┼───────────────────┼──────────────────────┤
                             │ spawn        │ Yes               │ Yes (default)        │
                             ├──────────────┼───────────────────┼──────────────────────┤
                             │ forkserver   │ Yes               │ No                   │
                             ├──────────────┼───────────────────┼──────────────────────┤
                             │ threading    │ Yes               │ Yes                  │
                             └──────────────┴───────────────────┴──────────────────────┘
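
        The start method can be selected through the start_method parameter of the mpire.WorkerPool constructor,
        as is also done in several examples in this document:

           from mpire import WorkerPool

           # Use threading instead of processes, e.g., for I/O-intensive tasks
           with WorkerPool(n_jobs=4, start_method='threading') as pool:
               ...

           # Explicitly use spawn, which is available on both Unix and Windows
           with WorkerPool(n_jobs=4, start_method='spawn') as pool:
               ...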

   Spawn and forkserver
       When using spawn or forkserver as start method, be aware that global variables (constants are fine) might
       have a different value than you might expect.  You  also  have  to  import  packages  within  the  called
       function:

          import os

          def failing_job(folder, filename):
              return os.path.join(folder, filename)

          # This will fail because 'os' is not copied to the child processes
          with WorkerPool(n_jobs=2, start_method='spawn') as pool:
              pool.map(failing_job, [('folder', '0.p3'), ('folder', '1.p3')])

          def working_job(folder, filename):
              import os
              return os.path.join(folder, filename)

          # This will work
          with WorkerPool(n_jobs=2, start_method='spawn') as pool:
              pool.map(working_job, [('folder', '0.p3'), ('folder', '1.p3')])

       A  lot  of  effort  has been put into making the progress bar, dashboard, and nested pools (with multiple
       progress bars) work well with spawn and forkserver. So, everything should work fine.

   CPU pinning
       You can pin the child processes of mpire.WorkerPool to specific CPUs by using the  cpu_ids  parameter  in
       the constructor:

          # Pin the two child processes to CPUs 2 and 3
          with WorkerPool(n_jobs=2, cpu_ids=[2, 3]) as pool:
              ...

          # Pin the child processes to CPUs 40-59
          with WorkerPool(n_jobs=20, cpu_ids=list(range(40, 60))) as pool:
              ...

          # All child processes have to share a single core:
          with WorkerPool(n_jobs=4, cpu_ids=[0]) as pool:
              ...

          # All child processes have to share multiple cores, namely 4-7:
          with WorkerPool(n_jobs=4, cpu_ids=[[4, 5, 6, 7]]) as pool:
              ...

          # Each child process can use two distinctive cores:
          with WorkerPool(n_jobs=4, cpu_ids=[[0, 1], [2, 3], [4, 5], [6, 7]]) as pool:
              ...

        CPU IDs have to be non-negative integers, not exceeding the number of CPUs available (which can be retrieved
       by using mpire.cpu_count()). Use None to disable CPU pinning (which is the default).
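
        For example, assuming mpire.cpu_count() is importable from the top-level package as referenced above, you
        could pin one worker to each available CPU:

           from mpire import WorkerPool, cpu_count

           n_cpus = cpu_count()
           with WorkerPool(n_jobs=n_cpus, cpu_ids=list(range(n_cpus))) as pool:
               ...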

       NOTE:
          Pinning processes to CPU IDs doesn’t work when using threading or when you’re on macOS.

   Accessing the worker ID

       Each worker in MPIRE is given an integer ID to distinguish them. Worker #1 will have ID 0, #2  will  have
       ID 1, etc. Sometimes it can be useful to have access to this ID.

       By  default,  the  worker  ID is not passed on. You can enable/disable this by setting the pass_worker_id
       flag:

          def task(worker_id, x):
              pass

          with WorkerPool(n_jobs=4, pass_worker_id=True) as pool:
              pool.map(task, range(10))

       IMPORTANT:
          The worker ID will always be the first argument passed on to the provided function.

       Instead  of  passing  the  flag  to  the  mpire.WorkerPool  constructor  you  can  also   make   use   of
       mpire.WorkerPool.pass_on_worker_id():

          with WorkerPool(n_jobs=4) as pool:
              pool.pass_on_worker_id()
              pool.map(task, range(10))

   Elaborate example
       Here’s  a  more  elaborate example of using the worker ID together with a shared array, where each worker
       can only access the element corresponding to its worker ID, making the use of locking unnecessary:

           from multiprocessing import Array

           def square_sum(worker_id, shared_objects, x):
               # Even though the shared object is a single container, we 'unpack' it anyway
              results_container = shared_objects

              # Square and sum
              results_container[worker_id] += x * x

          # Use a shared array of size equal to the number of jobs to store the results
          results_container = Array('f', 4, lock=False)

          with WorkerPool(n_jobs=4, shared_objects=results_container, pass_worker_id=True) as pool:
              # Square the results and store them in the results container
              pool.map_unordered(square_sum, range(100))

   Shared objects

       MPIRE allows you to provide shared objects to the workers in a  similar  way  as  is  possible  with  the
       multiprocessing.Process   class.  For  the  start  method  fork  these  shared  objects  are  treated  as
       copy-on-write, which means they are only copied once changes are made to them. Otherwise they  share  the
       same  memory  address. This is convenient if you want to let workers access a large dataset that wouldn’t
       fit in memory when copied multiple times.

       NOTE:
          The start method fork isn’t available on Windows, which means copy-on-write isn’t supported there.

        For threading these shared objects are readable and writable without copies being made. For the start
        methods spawn and forkserver the shared objects are copied once for each worker, in contrast to copying
        them for each task, which is what happens when using a regular multiprocessing.Pool.

          def task(dataset, x):
              # Do something with this copy-on-write dataset
              ...

          def main():
              dataset = ... # Load big dataset
              with WorkerPool(n_jobs=4, shared_objects=dataset, start_method='fork') as pool:
                  ... = pool.map(task, range(100))

       Multiple objects can be provided by placing them, for example, in a tuple container.

        Apart from sharing regular Python objects between workers, you can also share multiprocessing
        synchronization primitives such as multiprocessing.Lock using this method. Objects like these need to be
        shared through inheritance, which is exactly how shared objects in MPIRE are passed on.
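
        For example, a minimal sketch of sharing a multiprocessing.Lock (together with a hypothetical output file
        path) through shared_objects, using the fork start method:

           from multiprocessing import Lock
           from mpire import WorkerPool

           def append_result(shared_objects, x):
               # The lock is inherited by the workers rather than pickled
               lock, output_file = shared_objects
               with lock:
                   with open(output_file, 'a') as f:
                       f.write(f"{x * x}\n")

           def main():
               # 'results.txt' is just an illustrative path
               shared = (Lock(), 'results.txt')
               with WorkerPool(n_jobs=4, shared_objects=shared, start_method='fork') as pool:
                   pool.map(append_result, range(100))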

       IMPORTANT:
          Shared objects are passed on as the second argument, after  the  worker  ID  (when  enabled),  to  the
          provided function.

       Instead  of  passing  the  shared  objects  to  the  mpire.WorkerPool  constructor  you  can also use the
       mpire.WorkerPool.set_shared_objects() function:

          def main():
              dataset = ... # Load big dataset
              with WorkerPool(n_jobs=4, start_method='fork') as pool:
                  pool.set_shared_objects(dataset)
                  ... = pool.map(task, range(100))

       Shared objects have to be specified before the workers are started. Workers are started  once  the  first
       map  call  is  executed.  When  keep_alive=True  and  the workers are reused, changing the shared objects
       between two consecutive map calls won’t work.

   Copy-on-write alternatives
        When copy-on-write is not available to you, you can also use shared objects to share a
        multiprocessing.Array, multiprocessing.Value, or another object through a multiprocessing.Manager. You can
        then store results in the same object from multiple processes. However, you should keep the amount of
        synchronization to a minimum when the resources are protected with a lock, or disable locking if your
        situation allows it, as shown here:

          from multiprocessing import Array

          def square_add_and_modulo_with_index(shared_objects, idx, x):
              # Unpack results containers
              square_results_container, add_results_container = shared_objects

              # Square, add and modulo
              square_results_container[idx] = x * x
              add_results_container[idx] = x + x
              return x % 2

          def main():
              # Use a shared array of size 100 and type float to store the results
              square_results_container = Array('f', 100, lock=False)
              add_results_container = Array('f', 100, lock=False)
              shared_objects = square_results_container, add_results_container
              with WorkerPool(n_jobs=4, shared_objects=shared_objects) as pool:

                  # Square, add and modulo the results and store them in the results containers
                  modulo_results = pool.map(square_add_and_modulo_with_index,
                                            enumerate(range(100)), iterable_len=100)

        In the example above we create two results containers, one for squaring and one for adding the given value,
       and disable locking for both. Additionally, we also return a value, even though we use shared objects for
       storing  results.  We  can  safely  disable  locking here as each task writes to a different index in the
       array, so no race conditions can occur.  Disabling locking is, of course, a lot  faster  than  having  it
       enabled.

   Worker state

       If you want to let each worker have its own state you can use the use_worker_state flag:

          def task(worker_state, x):
              if "local_sum" not in worker_state:
                  worker_state["local_sum"] = 0
              worker_state["local_sum"] += x

          with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
              results = pool.map(task, range(100))

       IMPORTANT:
          The  worker  state  is  passed  on as the third argument, after the worker ID and shared objects (when
          enabled), to the provided function.

       Instead  of  passing  the  flag  to  the  mpire.WorkerPool  constructor  you  can  also   make   use   of
       mpire.WorkerPool.set_use_worker_state():

          with WorkerPool(n_jobs=4) as pool:
              pool.set_use_worker_state()
              pool.map(task, range(100))

   Combining worker state with worker_init and worker_exit
       The  worker  state  can be combined with the worker_init and worker_exit parameters of each map function,
       leading to some really useful capabilities:

          import numpy as np
          import pickle

          def load_big_model(worker_state):
              # Load a model which takes up a lot of memory
              with open('./a_really_big_model.p3', 'rb') as f:
                  worker_state['model'] = pickle.load(f)

          def model_predict(worker_state, x):
              # Predict
              return worker_state['model'].predict(x)

          with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
              # Let the model predict
              data = np.array([[...]])
              results = pool.map(model_predict, data, worker_init=load_big_model)

       More information about the worker_init and worker_exit parameters can be found at Worker init and exit.

   Combining worker state with keep_alive
       By default, workers are restarted each time a map function is executed. As described in Keep  alive  this
       can  be  circumvented by using keep_alive=True. This also ensures worker state is kept across consecutive
       map calls:

          with WorkerPool(n_jobs=4, use_worker_state=True, keep_alive=True) as pool:
              # Let the model predict
              data = np.array([[...]])
              results = pool.map(model_predict, data, worker_init=load_big_model)

              # Predict some more
              more_data = np.array([[...]])
              more_results = pool.map(model_predict, more_data)

       In this example we don’t need to supply the worker_init function to the second map call, as  the  workers
       will be reused. When worker_lifespan is set, though, this rule doesn’t apply.
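
        For example, a sketch of the situation described above, reusing load_big_model and model_predict from the
        previous example. Because worker_lifespan can restart workers mid-run, the worker_init function is passed
        to both map calls:

           with WorkerPool(n_jobs=4, use_worker_state=True, keep_alive=True) as pool:
               results = pool.map(model_predict, data, worker_init=load_big_model, worker_lifespan=100)

               # Workers that hit their lifespan are restarted, so pass worker_init again
               more_results = pool.map(model_predict, more_data, worker_init=load_big_model,
                                       worker_lifespan=100)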

   Keep alive

       By  default,  workers  are  restarted  on each map call. This is done to clean up resources as quickly as
       possible when the work is done.

       Workers can be kept alive in between consecutive map calls using the keep_alive flag. This is useful when
       your workers have a long startup time and you need to call one of the map functions multiple times.

           def task(x):
              pass

          with WorkerPool(n_jobs=4, keep_alive=True) as pool:
              pool.map(task, range(100))
              pool.map(task, range(100))  # Workers are reused here

       Instead  of  passing  the  flag  to  the  mpire.WorkerPool  constructor  you  can  also   make   use   of
       mpire.WorkerPool.set_keep_alive():

          with WorkerPool(n_jobs=4) as pool:
              pool.map(task, range(100))
              pool.map(task, range(100))  # Workers are restarted
              pool.set_keep_alive()
              pool.map(task, range(100))  # Workers are reused here

   Caveats
        Changing some WorkerPool init parameters does require a restart. These include pass_worker_id,
       shared_objects, and use_worker_state.

       Keeping workers alive works even when the function to be called or any other parameter passed on  to  the
       map function changes.

        However, when you change the worker_init and/or worker_exit function while keep_alive is
        enabled, be aware that this can have undesired side effects. worker_init functions are only
       executed  when a worker is started and worker_exit functions when a worker is terminated. When keep_alive
       is enabled, workers aren’t restarted in between consecutive map calls, so those functions are not called.

          def init_func_1(): pass
          def exit_func_1(): pass

          def init_func_2(): pass
           def exit_func_2(): pass

          with WorkerPool(n_jobs=4, keep_alive=True) as pool:
              pool.map(task, range(100), worker_init=init_func_1, worker_exit=exit_func_1)
              pool.map(task, range(100), worker_init=init_func_2, worker_exit=exit_func_2)

       In the above example init_func_1 is called for each worker when the workers are started. After the  first
       map call exit_func_1 is not called because workers are kept alive. During the second map call init_func_2
        isn’t called either, because the workers are still alive. When exiting the context manager the workers
       are shut down and exit_func_2 is called.

       It gets even trickier when you also enable worker_lifespan. In this scenario during the first map call  a
       worker  could’ve  reached  its  maximum lifespan and is forced to restart, while others haven’t. The exit
       function of the worker to be restarted is called (i.e., exit_func_1). When calling  map  for  the  second
       time  and  the  exit  function is changed, the other workers will execute the new exit function when they
       need to be restarted (i.e., exit_func_2).

   Worker insights
        Worker insights gives you insight into your multiprocessing efficiency by tracking worker start-up time,
        waiting time and time spent on executing tasks. Tracking is disabled by default, but can be enabled by
       setting enable_insights:

          with WorkerPool(n_jobs=4, enable_insights=True) as pool:
              pool.map(task, range(100))

       The overhead is very minimal and you shouldn’t really notice it, even on very small tasks. You  can  view
       the  tracking  results  using mpire.WorkerPool.get_insights() or use mpire.WorkerPool.print_insights() to
       directly print the insights to console:

          import time

          def sleep_and_square(x):
              # For illustration purposes
              time.sleep(x / 1000)
              return x * x

          with WorkerPool(n_jobs=4, enable_insights=True) as pool:
              pool.map(sleep_and_square, range(100))
              insights = pool.get_insights()
              print(insights)

          # Output:
          {'n_completed_tasks': [28, 24, 24, 24],
           'total_start_up_time': '0:00:00.038',
           'total_init_time': '0:00:00',
           'total_waiting_time': '0:00:00.798',
           'total_working_time': '0:00:04.980',
           'total_exit_time': '0:00:00',
           'total_time': '0:00:05.816',
           'start_up_time': ['0:00:00.010', '0:00:00.008', '0:00:00.008', '0:00:00.011'],
           'start_up_time_mean': '0:00:00.009',
           'start_up_time_std': '0:00:00.001',
           'start_up_ratio': 0.006610452621805033,
           'init_time': ['0:00:00', '0:00:00', '0:00:00', '0:00:00'],
           'init_time_mean': '0:00:00',
           'init_time_std': '0:00:00',
           'init_ratio': 0.0,
           'waiting_time': ['0:00:00.309', '0:00:00.311', '0:00:00.165', '0:00:00.012'],
           'waiting_time_mean': '0:00:00.199',
           'waiting_time_std': '0:00:00.123',
           'waiting_ratio': 0.13722942739284952,
           'working_time': ['0:00:01.142', '0:00:01.135', '0:00:01.278', '0:00:01.423'],
           'working_time_mean': '0:00:01.245',
           'working_time_std': '0:00:00.117',
           'working_ratio': 0.8561601182661567,
            'exit_time': ['0:00:00', '0:00:00', '0:00:00', '0:00:00'],
           'exit_time_mean': '0:00:00',
           'exit_time_std': '0:00:00',
           'exit_ratio': 0.0,
           'top_5_max_task_durations': ['0:00:00.099', '0:00:00.098', '0:00:00.097', '0:00:00.096',
                                        '0:00:00.095'],
           'top_5_max_task_args': ['Arg 0: 99', 'Arg 0: 98', 'Arg 0: 97', 'Arg 0: 96', 'Arg 0: 95']}

       We specified 4 workers, so there are  4  entries  in  the  n_completed_tasks,  start_up_time,  init_time,
       waiting_time, working_time, and exit_time containers. They show per worker the number of completed tasks,
        the total start-up time, the total time spent on the worker_init function, the total time waiting for new
        tasks, the total time spent on the main function, and the total time spent on the worker_exit function,
       respectively. The insights also contain mean, standard deviation, and ratio  of  the  tracked  time.  The
       ratio  is  the time for that part divided by the total time. In general, the higher the working ratio the
       more efficient your multiprocessing setup is. Of course, your setup might still not  be  optimal  because
       the task itself is inefficient, but timing that is beyond the scope of MPIRE.

       Additionally,  the insights keep track of the top 5 tasks that took the longest to run. The data is split
       up in two containers: one for the duration and one for the arguments that were  passed  on  to  the  task
       function. Both are sorted based on task duration (desc), so index 0 of the args list corresponds to index
       0 of the duration list, etc.

       When  using  the  MPIRE  Dashboard  you  can  track  these  insights in real-time. See Dashboard for more
       information.

       NOTE:
          When  using  imap  or  imap_unordered  you  can  view  the  insights  during  execution.  Simply  call
          get_insights() or print_insights() inside your loop where you process the results.
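
        For example, a minimal sketch of inspecting insights while an imap_unordered call is still producing
        results, reusing the sleep_and_square function from above:

           with WorkerPool(n_jobs=4, enable_insights=True) as pool:
               for i, result in enumerate(pool.imap_unordered(sleep_and_square, range(100))):
                   # Print intermediate insights every 25 results
                   if i > 0 and i % 25 == 0:
                       pool.print_insights()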

   Dill
       For  some  functions  or  tasks  it  can  be  useful  to  not  rely  on pickle, but on some more powerful
       serialization backends like dill. dill isn’t installed by default.  See  Dill  for  more  information  on
       installing the dependencies.

       One  specific  example  where  dill  shines  is when using start method spawn (the default on Windows) in
       combination with iPython or Jupyter notebooks.  dill  enables  parallelizing  more  exotic  objects  like
       lambdas and functions defined in iPython and Jupyter notebooks. For all benefits of dill, please refer to
       the dill documentation.

       Once the dependencies have been installed, you can enable it using the use_dill flag:

          with WorkerPool(n_jobs=4, use_dill=True) as pool:
              ...

       NOTE:
           Using dill can potentially slow down processing. This is the cost of having a more reliable
          and powerful serialization backend.

   Order tasks
        In some settings it can be useful to supply the tasks to workers in a round-robin fashion. This means
        worker 0 will get task 0, worker 1 will get task 1, etc. After each worker has been given a task, we start
        with worker 0 again instead of picking the worker that has most recently completed a task.

       When the chunk size is larger than 1, the tasks are distributed to the workers in order, but  in  chunks.
       I.e.,  when  chunk_size=3  tasks 0, 1, and 2 will be assigned to worker 0, tasks 3, 4, and 5 to worker 1,
       and so on.

       When keep_alive is set to True and the second map call is made, MPIRE resets the worker order and  starts
       at worker 0 again.

       WARNING:
          When  tasks  vary  in  execution  time,  the default task scheduler makes sure each worker is busy for
          approximately the same amount of time. This can mean that some workers execute more tasks than others.
          When using order_tasks this is no longer the case and therefore the total execution time is likely  to
          be higher.

       You can enable/disable task ordering by setting the order_tasks flag:

          def task(x):
              pass

          with WorkerPool(n_jobs=4, order_tasks=True) as pool:
              pool.map(task, range(10))

       Instead   of   passing   the  flag  to  the  mpire.WorkerPool  constructor  you  can  also  make  use  of
       mpire.WorkerPool.set_order_tasks():

          with WorkerPool(n_jobs=4) as pool:
              pool.set_order_tasks()
              pool.map(task, range(10))

   Map family
       This section describes the different ways of interacting with a mpire.WorkerPool instance.

   map family of functions

       mpire.WorkerPool implements four types of parallel map functions, being:

       mpire.WorkerPool.map()
              Blocks until results are ready, results are ordered in the same way as the provided arguments.

       mpire.WorkerPool.map_unordered()
              The same as mpire.WorkerPool.map(), but results are  ordered  by  task  completion  time.  Usually
              faster than mpire.WorkerPool.map().

       mpire.WorkerPool.imap()
              Lazy  version of mpire.WorkerPool.map(), returns a generator. The generator will give results back
              whenever new results are ready. Results are ordered in the same way as the provided arguments.

       mpire.WorkerPool.imap_unordered()
              The same as mpire.WorkerPool.imap(), but results are ordered  by  task  completion  time.  Usually
              faster than mpire.WorkerPool.imap().

       When using a single worker the unordered versions are equivalent to their ordered counterparts.

   Iterable of arguments
       Each  map  function  should  receive  a  function and an iterable of arguments, where the elements of the
       iterable can be single values or iterables that are unpacked as arguments. If an element is a dictionary,
       the (key, value) pairs will be unpacked with the **-operator.

          def square(x):
              return x * x

          with WorkerPool(n_jobs=4) as pool:
              # 1. Square the numbers, results should be: [0, 1, 4, 9, 16, 25, ...]
              results = pool.map(square, range(100))

       The first example should work as expected, the numbers are simply squared. MPIRE  knows  how  many  tasks
       there are because a range object implements the __len__ method (see Task chunking).

          with WorkerPool(n_jobs=4) as pool:
              # 2. Square the numbers, results should be: [0, 1, 4, 9, 16, 25, ...]
              # Note: don't execute this, it will take a long time ...
              results = pool.map(square, range(int(1e30)), iterable_len=int(1e30), chunk_size=1)

       In  the  second  example the 1e30 number is too large for Python: try calling len(range(int(1e30))), this
       will throw an OverflowError (don’t get me started …). Therefore, we must use the  iterable_len  parameter
       to  let  MPIRE  know  how large the tasks list is. We also have to specify a chunk size here as the chunk
       size should be lower than sys.maxsize.

          def multiply(x, y):
              return x * y

          with WorkerPool(n_jobs=4) as pool:
              # 3. Multiply the numbers, results should be [0, 101, 204, 309, 416, ...]
              for result in pool.imap(multiply, zip(range(100), range(100, 200)), iterable_len=100):
                  ...

       The third example shows an example of using multiple function arguments. Note that we use  imap  in  this
        example, which allows us to process the results whenever they become available, not having to wait for all
       results to be ready.

          with WorkerPool(n_jobs=4) as pool:
              # 4. Multiply the numbers, results should be [0, 101, ...]
              for result in pool.imap(multiply, [{'x': 0, 'y': 100}, {'y': 101, 'x': 1}, ...]):
                  ...

       The final example shows the use of an iterable of dictionaries. The (key, value) pairs are unpacked  with
       the **-operator, as you would expect. So it doesn’t matter in what order the keys are stored. This should
        work for collections.OrderedDict objects as well.

   Circumvent argument unpacking
        If you want to avoid unpacking and pass the tuples in example 3 or the dictionaries in example 4 as a
        whole, you can. We’ll continue with example 4, but the workaround for example 3 is similar.

       Suppose we have the following function which expects a dictionary:

          def multiply_dict(d):
              return d['x'] * d['y']

       Then you would have to convert the list of dictionaries to a list of single argument tuples,  where  each
       argument is a dictionary:

          with WorkerPool(n_jobs=4) as pool:
              # Multiply the numbers, results should be [0, 101, ...]
              for result in pool.imap(multiply_dict, [({'x': 0, 'y': 100},),
                                                      ({'y': 101, 'x': 1},),
                                                      ...]):
                  ...

       There is a utility function available that does this transformation for you:

          from mpire.utils import make_single_arguments

          with WorkerPool(n_jobs=4) as pool:
              # Multiply the numbers, results should be [0, 101, ...]
              for result in pool.imap(multiply_dict, make_single_arguments([{'x': 0, 'y': 100},
                                                                            {'y': 101, 'x': 1}, ...],
                                                                           generator=False)):
                  ...

        mpire.utils.make_single_arguments() expects an iterable of arguments and converts them to tuples
        accordingly. The second argument of this function specifies whether you want it to return a
        generator or a materialized list. If we wanted it to return a generator, we would need to pass on the
        iterable length to the map function as well.
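
        For example, a sketch of the generator variant, continuing with multiply_dict from above and supplying
        iterable_len ourselves:

           from mpire.utils import make_single_arguments

           dict_gen = ({'x': i, 'y': i + 100} for i in range(100))

           with WorkerPool(n_jobs=4) as pool:
               # generator=True returns a lazy generator, so MPIRE can't infer the length
               results = pool.map(multiply_dict,
                                  make_single_arguments(dict_gen, generator=True),
                                  iterable_len=100)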

   Mixing map functions
       map functions cannot be used while another map function is still running. E.g., the following will  raise
       an exception:

          with WorkerPool(n_jobs=4) as pool:
              imap_results = pool.imap(multiply, zip(range(100), range(100, 200)), iterable_len=100)
              next(imap_results)  # We actually have to start the imap function

              # Will raise because the imap function is still running
              map_results = pool.map(square, range(100))

       Make  sure  to  first finish the imap function before starting a new map function. This holds for all map
       functions.

   Not exhausting a lazy imap function
       If you don’t exhaust a lazy imap function, but do close the pool, the remaining tasks and results will be
       lost.  E.g., the following will raise an exception:

          with WorkerPool(n_jobs=4) as pool:
              imap_results = pool.imap(multiply, zip(range(100), range(100, 200)), iterable_len=100)
              first_result = next(imap_results)  # We actually have to start the imap function
              pool.terminate()

              # This will raise
              results = list(imap_results)

       Similarly, exiting the with block terminates the pool as well:

          with WorkerPool(n_jobs=4) as pool:
              imap_results = pool.imap(multiply, zip(range(100), range(100, 200)), iterable_len=100)
              first_result = next(imap_results)  # We actually have to start the imap function

          # This will raise
          results = list(imap_results)

   Progress bar

        Progress bar support is added through the tqdm package (installed by default when installing MPIRE). The
        easiest way to include a progress bar is by enabling the progress_bar flag in any of the map functions:

          with WorkerPool(n_jobs=4) as pool:
              pool.map(task, range(100), progress_bar=True)

       This  will  display  a basic tqdm progress bar displaying the time elapsed and remaining, number of tasks
       completed (including a percentage value) and the speed (i.e., number of tasks completed per time unit).

   Progress bar style
       You can switch to a different progress bar  style  by  changing  the  progress_bar_style  parameter.  For
       example, when you require a notebook widget use 'notebook' as the style:

          with WorkerPool(n_jobs=4) as pool:
              pool.map(task, range(100), progress_bar=True, progress_bar_style='notebook')

       The available styles are:

        • None: use the default style (= 'std', see below)

       • 'std': use the standard tqdm progress bar

       • 'rich': use the rich progress bar (requires the rich package to be installed, see Rich progress bars)

       • 'notebook': use the Jupyter notebook widget

       • 'dashboard': use only the progress bar on the dashboard

        Using the 'notebook' style in a terminal is not recommended, as the progress bar will behave
        erratically.

       NOTE:
          If you run into problems with getting the progress bar to work in a Jupyter notebook (with  'notebook'
          style), have a look at Progress bar issues with Jupyter notebooks.

   Changing the default style
       You can change the default style by setting the mpire.tqdm_utils.PROGRESS_BAR_DEFAULT_STYLE variable:

          import mpire.tqdm_utils

          mpire.tqdm_utils.PROGRESS_BAR_DEFAULT_STYLE = 'notebook'

   Progress bar options
       The  tqdm progress bar can be configured using the progress_bar_options parameter. This parameter accepts
       a dictionary with keyword arguments that will be passed to the tqdm constructor.

       Some options in tqdm will  be  overwritten  by  MPIRE.  These  include  the  iterable,  total  and  leave
       parameters. The iterable is set to the iterable passed on to the map function. The total parameter is set
       to  the number of tasks to be completed. The leave parameter is always set to True. Some other parameters
       have a default value assigned to them, but can be overwritten by the user.

       Here’s an example where we change the description, the units, and the colour of the progress bar:

          with WorkerPool(n_jobs=4) as pool:
              pool.map(some_func, some_data, progress_bar=True,
                       progress_bar_options={'desc': 'Processing', 'unit': 'items', 'colour': 'green'})

       For a complete list of available options, check out the tqdm docs.

   Progress bar position
       You can easily print a progress bar on a different position on the terminal using the position  parameter
       of  tqdm,  which  facilitates  the  use  of  multiple  progress bars. Here’s an example of using multiple
       progress bars using nested WorkerPools:

          def dispatcher(worker_id, X):
              with WorkerPool(n_jobs=4) as nested_pool:
                  return nested_pool.map(task, X, progress_bar=True,
                                         progress_bar_options={'position': worker_id + 1})

          def main():
              with WorkerPool(n_jobs=4, daemon=False, pass_worker_id=True) as pool:
                  pool.map(dispatcher, ((range(x, x + 100),) for x in range(100)), iterable_len=100,
                           n_splits=4, progress_bar=True)

          main()

       We use worker_id + 1 here because the worker IDs start at zero and we reserve position 0 for the progress
       bar of the main WorkerPool (which is the default).

       It goes without saying that you shouldn’t specify the same progress bar position multiple times.

       NOTE:
          When using the rich progress bar style, the position parameter cannot be used. An  exception  will  be
          raised when trying to do so.

       NOTE:
          Most  progress bar options are completely ignored when in a Jupyter/IPython notebook session or in the
          MPIRE dashboard.

   Worker init and exit
       When you want to initialize a worker you can make use of the worker_init parameter of any  map  function.
       This  will  call the initialization function only once per worker. Similarly, if you need to clean up the
       worker at the end of its lifecycle you can use the worker_exit parameter. Additionally, the exit function
       can return anything you like, which can be collected using mpire.WorkerPool.get_exit_results() after  the
       workers are done.

       Both  init  and exit functions receive the worker ID, shared objects, and worker state in the same way as
       the task function does, given they’re enabled.

       For example:

          def init_func(worker_state):
              # Initialize a counter for each worker
              worker_state['count_even'] = 0

          def square_and_count_even(worker_state, x):
              # Count number of even numbers and return the square
              if x % 2 == 0:
                  worker_state['count_even'] += 1
              return x * x

          def exit_func(worker_state):
              # Return the counter
              return worker_state['count_even']

          with WorkerPool(n_jobs=4, use_worker_state=True) as pool:
              pool.map(square_and_count_even, range(100), worker_init=init_func, worker_exit=exit_func)
              print(pool.get_exit_results())  # Output, e.g.: [13, 13, 12, 12]
              print(sum(pool.get_exit_results()))  # Output: 50

       IMPORTANT:
          When the worker_lifespan option is used to restart workers during execution, the exit function will be
          called for the worker that’s shutting down and the init function will be  called  again  for  the  new
          worker.    Therefore,    the    number    of    elements    in   the   list   that’s   returned   from
          mpire.WorkerPool.get_exit_results() does not always equal n_jobs.

       IMPORTANT:
          When keep_alive is enabled the workers won’t be terminated after a  map  call.  This  means  the  exit
          function  won’t be called until it’s time for cleaning up the entire pool. You will have to explicitly
          call mpire.WorkerPool.stop_and_join() to receive the exit results.
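
        For example, a minimal sketch, continuing with the functions from the example above:

           with WorkerPool(n_jobs=4, use_worker_state=True, keep_alive=True) as pool:
               pool.map(square_and_count_even, range(100), worker_init=init_func, worker_exit=exit_func)

               # Workers are still alive at this point, so no exit results are available yet
               pool.stop_and_join()
               print(pool.get_exit_results())  # e.g.: [13, 13, 12, 12]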

   Task chunking
        By default, MPIRE chunks the given tasks into 64 * n_jobs chunks. Each worker is given one chunk of
        tasks at a time before returning its results. This usually makes processing faster when you have rather
        small tasks (computation-wise), as tasks and results are pickled/unpickled when they are sent to a worker
        or the main process. Chunking the tasks and results ensures that each process has to pickle/unpickle less
        often.

       However,  to determine the number of tasks in the argument list the iterable should implement the __len__
       method, which is available in default containers  like  list  or  tuple,  but  isn’t  available  in  most
       generator  objects (the range object is one of the exceptions). To allow working with generators each map
       function has the option to pass the iterable length:

          with WorkerPool(n_jobs=4) as pool:
              # 1. This will issue a warning and sets the chunk size to 1
              results = pool.map(square, ((x,) for x in range(1000)))

              # 2. This will issue a warning as well and sets the chunk size to 1
              results = pool.map(square, ((x,) for x in range(1000)), n_splits=4)

              # 3. Square the numbers using a generator using a specific number of splits
              results = pool.map(square, ((x,) for x in range(1000)), iterable_len=1000, n_splits=4)

              # 4. Square the numbers using a generator using automatic chunking
              results = pool.map(square, ((x,) for x in range(1000)), iterable_len=1000)

              # 5. Square the numbers using a generator using a fixed chunk size
              results = pool.map(square, ((x,) for x in range(1000)), chunk_size=4)

       In the first two examples the function call will issue a warning because MPIRE doesn’t know how large the
       chunks should be as the total number of tasks is unknown, therefore it will fall back to a chunk size  of
       1.  The third example should work as expected where 4 chunks are used. The fourth example uses 256 chunks
       (the default 64 times the number of workers). The last example uses a fixed chunk size of four, so  MPIRE
       doesn’t need to know the iterable length.

       You can also call the chunk function manually:

          from mpire.utils import chunk_tasks

          # Convert to list because chunk_tasks returns a generator
          print(list(chunk_tasks(range(10), n_splits=3)))
          print(list(chunk_tasks(range(10), chunk_size=2.5)))
          print(list(chunk_tasks((x for x in range(10)), iterable_len=10, n_splits=6)))

       will output:

          [(0, 1, 2, 3), (4, 5, 6), (7, 8, 9)]
          [(0, 1, 2), (3, 4), (5, 6, 7), (8, 9)]
          [(0, 1), (2, 3), (4,), (5, 6), (7, 8), (9,)]

   Maximum number of active tasks
       When you have tasks that take up a lot of memory you can do a few things:

        • Limit the number of jobs (i.e., the number of worker processes)

        • Limit the number of active tasks (i.e., the number of tasks currently available to the workers: tasks
          that are in the queue ready to be processed)

       The first option is the most obvious one to save memory when the processes themselves use up much memory.
       The second is convenient when the argument list takes up too much memory. For example, suppose  you  want
        to kick off an enormous number of tasks (let’s say a billion) of which the arguments take up 1 KB per task
       (e.g., large strings), then that task queue would take up ~1 TB of memory!

        In such cases, a good rule of thumb would be to have twice as many active chunks of tasks as there are
        jobs. This means that when all workers complete their task at the same time, each would directly be able
        to continue with another task. When workers take on their new tasks the generator of tasks is iterated
        until there are again twice as many active chunks of tasks.

        In MPIRE, the maximum number of active tasks by default is set to n_jobs * chunk_size * 2, so you don’t
        have to tweak it for memory optimization. If, for whatever reason, you want to change this behavior, you
        can do so by setting the max_tasks_active parameter:

          with WorkerPool(n_jobs=4) as pool:
              results = pool.map(task, range(int(1e300)), iterable_len=int(1e300),
                                 chunk_size=int(1e5), max_tasks_active=4 * int(1e5))

       NOTE:
          Setting the max_tasks_active parameter to a value lower than n_jobs * chunk_size can  result  in  some
          workers not being able to do anything.

   Worker lifespan
        Occasionally, workers that process multiple memory-intensive tasks do not properly release the memory
        they used, which results in memory usage building up. This is not a bug in MPIRE, but a consequence of
        Python’s poor garbage collection. To avoid this type of problem you can set the worker lifespan: the
       number of tasks after which a worker should restart.

          with WorkerPool(n_jobs=4) as pool:
              results = pool.map(task, range(100), worker_lifespan=1, chunk_size=1)

       In this example each worker is restarted after finishing a single task.

       NOTE:
          When the worker lifespan has been reached, a worker will finish the current chunk of tasks before
          restarting. I.e., depending on the chunk_size, a worker could end up completing more tasks than allowed
          by the worker lifespan.
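
       For illustration, a minimal sketch of this interaction (the task function is assumed to be defined
       elsewhere): with worker_lifespan=2 but chunk_size=5, a worker only restarts after finishing its chunk of
       five tasks.

          with WorkerPool(n_jobs=2) as pool:
              # Each worker restarts only after completing its current chunk of 5 tasks,
              # even though worker_lifespan is set to 2
              results = pool.map(task, range(20), worker_lifespan=2, chunk_size=5)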

   Timeouts
       Timeouts can be set separately for the target, worker_init and worker_exit functions. When a timeout  has
       been set and reached, it will throw a TimeoutError:

          # Will raise TimeoutError, provided that the target function takes longer
          # than half a second to complete
          with WorkerPool(n_jobs=5) as pool:
              pool.map(time_consuming_function, range(10), task_timeout=0.5)

          # Will raise TimeoutError, provided that the worker_init function takes longer
          # than 3 seconds to complete or the worker_exit function takes longer than
          # 150.5 seconds to complete
          with WorkerPool(n_jobs=5) as pool:
              pool.map(time_consuming_function, range(10), worker_init=init, worker_exit=exit_,
                       worker_init_timeout=3.0, worker_exit_timeout=150.5)

       Use None (=default) to disable timeouts.

   imap and imap_unordered
       When you’re using one of the lazy map functions (e.g., imap or imap_unordered), an exception will only be
       raised once the results are actually being consumed. E.g., when executing:

          with WorkerPool(n_jobs=5) as pool:
              results = pool.imap(time_consuming_function, range(10), task_timeout=0.5)

       this will never raise a TimeoutError. This is because imap and imap_unordered return a generator object,
       which doesn’t execute anything until it is asked for the next result. When iterating through the results,
       it will raise as expected:

          with WorkerPool(n_jobs=5) as pool:
              results = pool.imap(time_consuming_function, range(10), task_timeout=0.5)
              for result in results:
                  ...

   Threading
       When using threading as start method MPIRE won’t be able to interrupt certain functions, like time.sleep.
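
       For example, a minimal sketch of the caveat (the sleep duration is arbitrary): with the 'threading' start
       method, a time.sleep inside the task cannot be interrupted by MPIRE, e.g. when a KeyboardInterrupt is
       issued.

          import time

          from mpire import WorkerPool

          def sleepy_task(x):
              # MPIRE cannot interrupt this sleep when the 'threading' start method is used
              time.sleep(10)
              return x

          with WorkerPool(n_jobs=2, start_method='threading') as pool:
              results = pool.map(sleepy_task, range(2))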

   Numpy arrays
    Contents: Chunking, Return value

   Chunking
       Numpy arrays are treated a little differently when passed on to the map functions. Usually MPIRE uses
       itertools.islice for chunking, which relies on the __iter__ special method of the container object. But
       applying that to numpy arrays:

          import itertools

          import numpy as np

          # Create random array
          arr = np.random.rand(10, 3)
          chunk_size = 3

          # Chunk the array using default chunking, which relies on itertools.islice
          def chunk_with_islice(iterable, chunk_size):
              iterator = iter(iterable)
              while True:
                  chunk = list(itertools.islice(iterator, chunk_size))
                  if chunk:
                      yield chunk
                  else:
                      break

          for chunk in chunk_with_islice(arr, chunk_size):
              print(chunk)

       would yield:

          [array([0.68438994, 0.9701514 , 0.40083965]), array([0.88428556, 0.2083905 , 0.61490443]),
           array([0.89249174, 0.39902235, 0.70762541])]
          [array([0.18850964, 0.1022777 , 0.41539432]), array([0.07327858, 0.18608165, 0.75862301]),
           array([0.69215651, 0.4211941 , 0.31029439])]
          [array([0.82571272, 0.72257819, 0.86079131]), array([0.91285817, 0.49398461, 0.27863929]),
           array([0.146981  , 0.84671211, 0.30122806])]
          [array([0.11783283, 0.12585031, 0.39864368])]

       In other words, each row of the array is now in its own array and each one of them is given to the target
       function individually. Instead, MPIRE chunks numpy arrays into something more reasonable using numpy
       slicing:

          from mpire.utils import chunk_tasks

          for chunk in chunk_tasks(arr, chunk_size=chunk_size):
              print(repr(chunk))

       Output:

          array([[0.68438994, 0.9701514 , 0.40083965],
                 [0.88428556, 0.2083905 , 0.61490443],
                 [0.89249174, 0.39902235, 0.70762541]])
          array([[0.18850964, 0.1022777 , 0.41539432],
                 [0.07327858, 0.18608165, 0.75862301],
                 [0.69215651, 0.4211941 , 0.31029439]])
          array([[0.82571272, 0.72257819, 0.86079131],
                 [0.91285817, 0.49398461, 0.27863929],
                 [0.146981  , 0.84671211, 0.30122806]])
          array([[0.11783283, 0.12585031, 0.39864368]])

       Each chunk is now a single numpy array containing as many rows as the chunk size,  except  for  the  last
       chunk as there aren’t enough rows left.

   Return value
       When the user-provided function returns numpy arrays and you’re using the mpire.WorkerPool.map() function,
       MPIRE will by default concatenate the resulting numpy arrays into a single array. For example:

          def add_five(x):
              return x + 5

          with WorkerPool(n_jobs=4) as pool:
              results = pool.map(add_five, arr, chunk_size=chunk_size)

       will return:

          array([[5.68438994, 5.9701514 , 5.40083965],
                 [5.88428556, 5.2083905 , 5.61490443],
                 [5.89249174, 5.39902235, 5.70762541],
                 [5.18850964, 5.1022777 , 5.41539432],
                 [5.07327858, 5.18608165, 5.75862301],
                 [5.69215651, 5.4211941 , 5.31029439],
                 [5.82571272, 5.72257819, 5.86079131],
                 [5.91285817, 5.49398461, 5.27863929],
                 [5.146981  , 5.84671211, 5.30122806],
                 [5.11783283, 5.12585031, 5.39864368]])

       This behavior can be disabled by using the concatenate_numpy_output flag:

          with WorkerPool(n_jobs=4) as pool:
              results = pool.map(add_five, arr, chunk_size=chunk_size, concatenate_numpy_output=False)

       This will return individual arrays:

          [array([[5.68438994, 5.9701514 , 5.40083965],
                  [5.88428556, 5.2083905 , 5.61490443],
                  [5.89249174, 5.39902235, 5.70762541]]),
           array([[5.18850964, 5.1022777 , 5.41539432],
                  [5.07327858, 5.18608165, 5.75862301],
                  [5.69215651, 5.4211941 , 5.31029439]]),
           array([[5.82571272, 5.72257819, 5.86079131],
                  [5.91285817, 5.49398461, 5.27863929],
                  [5.146981  , 5.84671211, 5.30122806]]),
           array([[5.11783283, 5.12585031, 5.39864368]])]

   Apply family
    Contents: apply, apply_async, AsyncResult, Callbacks, Worker init and exit, Timeouts

       mpire.WorkerPool  implements  two  apply  functions,  which  are  very  similar  to  the  ones   in   the
       multiprocessing module:

       mpire.WorkerPool.apply()
              Apply a function to a single task. This is a blocking call.

       mpire.WorkerPool.apply_async()
              A  variant of the above, but which is non-blocking. This returns an mpire.async_result.AsyncResult
              object.

   apply
       The apply function is a blocking call, which means that it will not return until the task is completed.
       If you want to run multiple different tasks in parallel, you should use the apply_async function instead.
       If you need to run the same function for many tasks in parallel, use the map functions instead.

       The  apply  function  takes  a  function,  positional  arguments,  and  keyword arguments, similar to how
       multiprocessing does it.

          def task(a, b, c, d):
              return a + b + c + d

          with WorkerPool(n_jobs=1) as pool:
              result = pool.apply(task, args=(1, 2), kwargs={'d': 4, 'c': 3})
              print(result)

   apply_async
       The apply_async function is a non-blocking call, which means that it will return immediately. It  returns
       an  mpire.async_result.AsyncResult  object,  which  can  be used to get the result of the task at a later
       moment in time.

       The apply_async function takes the same parameters as the apply function.

          def task(a, b):
              return a + b

          with WorkerPool(n_jobs=4) as pool:
              async_results = [pool.apply_async(task, args=(i, i)) for i in range(10)]
              results = [async_result.get() for async_result in async_results]

       Obtaining the results should happen while the pool is still running! E.g., the following will deadlock:

          with WorkerPool(n_jobs=4) as pool:
              async_results = [pool.apply_async(task, args=(i, i)) for i in range(10)]

          # Will wait forever
          results = [async_result.get() for async_result in async_results]

       You can, however, make use of the mpire.WorkerPool.stop_and_join() function to stop the workers and  join
       the pool. This will make sure that all tasks are completed before the pool exits.

          with WorkerPool(n_jobs=4) as pool:
              async_results = [pool.apply_async(task, args=(i, i)) for i in range(10)]
              pool.stop_and_join()

          # Will not deadlock
          results = [async_result.get() for async_result in async_results]

   AsyncResult
       The mpire.async_result.AsyncResult object has the following convenient methods:

          with WorkerPool(n_jobs=1) as pool:
              async_result = pool.apply_async(task, args=(1, 1))

              # Check if the task is completed
              is_completed = async_result.ready()

              # Wait until the task is completed, or until the timeout is reached.
              async_result.wait(timeout=10)

              # Get the result of the task. This will block until the task is completed,
              # or until the timeout is reached.
              result = async_result.get(timeout=None)

              # Check if the task was successful (i.e., did not raise an exception).
              # This will raise an exception if the task is not completed yet.
              is_successful = async_result.successful()

   Callbacks
       Each apply function has a callback and error_callback argument. These are functions which are called when
       the  task  is  finished.  The  callback  function is called with the result of the task when the task was
       completed successfully, and the error_callback is called with the exception when the task failed.

          def task(a):
              return a + 1

          def callback(result):
              print("Task completed successfully with result:", result)

          def error_callback(exception):
              print("Task failed with exception:", exception)

          with WorkerPool(n_jobs=1) as pool:
              pool.apply(task, 42, callback=callback, error_callback=error_callback)

   Worker init and exit
       As with the map family of functions, the apply family of functions also has worker_init  and  worker_exit
       arguments.  These  are functions which are called when a worker is started and stopped, respectively. See
       Worker init and exit for more information on these functions.

          def worker_init():
              print("Worker started")

          def worker_exit():
              print("Worker stopped")

          with WorkerPool(n_jobs=5) as pool:
              pool.apply(task, 42, worker_init=worker_init, worker_exit=worker_exit)

       There’s a caveat, though. When the first apply or apply_async function is executed, the entire pool of
       workers is started. In the above example this means all five workers are started, while only one was
       needed. It also means that the worker_init function is set for all those workers at once, so you cannot
       have a different worker_init function for each apply task. A second, different worker_init function will
       simply be ignored.

       Similarly,  the  worker_exit function can only be set once as well. Additionally, exit functions are only
       called when a worker exits, which in this case translates to when the pool exits. This means that if  you
       call  apply  or  apply_async multiple times, the worker_exit function is only called once at the end. Use
       mpire.WorkerPool.stop_and_join() to stop the workers, which will cause the  worker_exit  function  to  be
       triggered for each worker.
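
       A minimal sketch of this behavior (the helper functions here are illustrative):

          def task(x):
              return x + 1

          def init_a():
              print("init A")  # set for all workers when the pool starts

          def init_b():
              print("init B")  # ignored: the workers were already started with init_a

          def exit_fn():
              print("worker exiting")

          with WorkerPool(n_jobs=2) as pool:
              pool.apply(task, args=(1,), worker_init=init_a, worker_exit=exit_fn)
              pool.apply(task, args=(2,), worker_init=init_b)  # second worker_init is ignored
              pool.stop_and_join()  # triggers exit_fn for each worker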

   Timeouts
       The apply family of functions also has task_timeout, worker_init_timeout and worker_exit_timeout
       arguments. These are timeouts for the task, the worker_init function and the worker_exit function,
       respectively. They work similarly to those for the map functions.

       When  a  single task times out, only that task is cancelled. The other tasks will continue to run. When a
       worker init or exit times out, the entire pool is stopped.
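
       For example, a minimal sketch (time_consuming_function is assumed to be defined elsewhere):

          with WorkerPool(n_jobs=2) as pool:
              async_result = pool.apply_async(time_consuming_function, args=(1,), task_timeout=0.5)
              # get() re-raises the TimeoutError when the task took longer than half a second
              result = async_result.get()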

       See Timeouts for more information.

   Dashboard
       The dashboard allows you to see progress information from a browser.  This  is  convenient  when  running
       scripts  in  a  notebook  or screen, if you want to share the progress information with others, or if you
       want to get real-time worker insight information.

       The dashboard dependencies are not installed by default. See Dashboard for more information.

    Contents: Starting the dashboard, Connecting to an existing dashboard, Using the dashboard, Stack level

   Starting the dashboard
       You can start the dashboard programmatically:

          from mpire.dashboard import start_dashboard

          # Will return a dictionary with dashboard details
          dashboard_details = start_dashboard()
          print(dashboard_details)

       which will print:

          {'dashboard_port_nr': 8080,
           'manager_host': 'localhost',
           'manager_port_nr': 8081}

       This will start a dashboard on your local machine on port 8080. When the port is already in use, MPIRE
       will try the next one until it finds an unused port. By default, MPIRE tries ports 8080-8100; in the rare
       case that none of these ports are available, the function will raise an OSError. You can override this
       range by passing on a custom range object:

          dashboard_details = start_dashboard(range(9000, 9100))

       The  returned dictionary contains the port number that is ultimately chosen. It also contains information
       on how to connect to this dashboard remotely.

       Another way of starting a dashboard is by using the bash script (this doesn’t work on Windows!):

          $ mpire-dashboard

       This will start a dashboard with the connection details printed on screen. It will say something like:

          Starting MPIRE dashboard...

          MPIRE dashboard started on http://localhost:8080
          Server is listening on localhost:8098
          --------------------------------------------------

       The server part corresponds to the manager_host and manager_port_nr from the dictionary returned by
       mpire.dashboard.start_dashboard(). As before, a custom port range can be provided:

          $ mpire-dashboard --port-range 9000-9100

       The  benefit  of  starting a dashboard this way is that your dashboard keeps running in case of errors in
       your script. You will be able to see what the error was, when it occurred and where it occurred  in  your
       code.

   Connecting to an existing dashboard
       If you have started a dashboard elsewhere, you can connect to it using:

          from mpire.dashboard import connect_to_dashboard

          connect_to_dashboard(manager_port_nr=8081, manager_host='localhost')

       Make sure you use the manager_port_nr, not the dashboard_port_nr in the examples above.

       You can connect to an existing dashboard on the same machine, but also on a remote machine (if the ports
       are open). If manager_host is omitted it will fall back to using 'localhost'.

   Using the dashboard
       Once connected to a dashboard, you don’t need to change anything in your code. When you have enabled the
       use of a progress bar in your map call, the progress bar will automatically register itself with the
       dashboard server and show up, like here:

          from mpire import WorkerPool
          from mpire.dashboard import connect_to_dashboard

          connect_to_dashboard(8099)

          def square(x):
              import time
              time.sleep(0.01)  # To be able to show progress
              return x * x

          with WorkerPool(4) as pool:
              pool.map(square, range(10000), progress_bar=True)

       This will show something like: [image]

       You can click on a progress bar row to view details about the function that is called (which has  already
       been done in the screenshot above).

       It will let you know when a KeyboardInterrupt signal was sent to the running process: [image]

       or show the traceback information in case of an exception: [image]

       In case you have enabled Worker insights, these insights will be shown in real time in the dashboard:
       [image]

       Click on the Insights (click to expand/collapse) to either expand or collapse the insight details.

       The dashboard will refresh automatically every 0.5 seconds.

   Stack level
       By  default, the dashboard will show information about the function that is called and where it is called
       from. However, in some cases where you have wrapped the function in another function, you might  be  less
       interested  in  the wrapper function and more interested in the function that is calling this wrapper. In
       such cases you can use mpire.dashboard.set_stacklevel() to set the stack level. This  is  the  number  of
       levels  in  the  stack  to go back in order to find the frame that contains the function that is invoking
       MPIRE. For example:

          from mpire import WorkerPool
          from mpire.dashboard import set_stacklevel, start_dashboard

          class WorkerPoolWrapper:
              def __init__(self, n_jobs, progress_bar=True):
                  self.n_jobs = n_jobs
                  self.progress_bar = progress_bar

              def __call__(self, func, data):
                  with WorkerPool(self.n_jobs) as pool:
                      return pool.map(func, data, progress_bar=self.progress_bar)

          def square(x):
              return x * x

          if __name__ == '__main__':
              start_dashboard()
              executor = WorkerPoolWrapper(4, progress_bar=True)
              set_stacklevel(1)  # default
              results = executor(square, range(10000))
              set_stacklevel(2)
              results = executor(square, range(10000))

       When you run this code you will see that the dashboard will show two progress bars. In  both  cases,  the
       dashboard  will  show  the square function as the function that is called. However, in the first case, it
       will show return pool.map(func, data, progress_bar=self.progress_bar) as the  line  where  it  is  called
       from. In the second case, it will show the results = executor(square, range(10000)) line.

   Troubleshooting
       This section describes some known problems that can arise when using MPIRE.

    Contents: Progress bar issues with Jupyter notebooks (IProgress not found, Widget Javascript not detected), Unit tests, Shutting down takes a long time on error, Unpicklable tasks/results, AttributeError: Can’t get attribute ‘<some_function>’ on <module ‘__main__’ (built-in)>, Windows, macOS

   Progress bar issues with Jupyter notebooks
       When  using  the  progress  bar in a Jupyter notebook you might encounter some issues. A few of these are
       described below, together with possible solutions.

   IProgress not found
       When you see something like ImportError: IProgress not found. Please update jupyter and ipywidgets., this
       means ipywidgets is not installed. You can install it using pip:

          pip install ipywidgets

       or conda:

          conda install -c conda-forge ipywidgets

       Have a look at the ipywidgets documentation for more information.

   Widget Javascript not detected
       When  you  see something like Widget Javascript not detected. It may not be enabled properly., this means
       the Javascript extension is not enabled. You can enable it using the following  command  before  starting
       your notebook:

          jupyter nbextension enable --py --sys-prefix widgetsnbextension

       Note that you have to restart your notebook server after enabling the extension; simply restarting the
       kernel won’t be enough.

   Unit tests
       When using the 'spawn' or 'forkserver' method you’ll probably run into one or  two  issues  when  running
       unittests  in your own package. One problem that might occur is that your unittests will restart whenever
       the piece of code containing such a start method is called, leading to very  funky  terminal  output.  To
       remedy  this problem make sure your setup call in setup.py is surrounded by an if __name__ == '__main__':
       clause:

          from setuptools import setup

          if __name__ == '__main__':

              # Call setup and install any dependencies you have inside the if-clause
              setup(...)

       See the ‘Safe importing of main module’ section in the Python multiprocessing programming guidelines.

       The second problem you might encounter is that the semaphore tracker  of  multiprocessing  will  complain
       when  you  run individual (or a selection of) unittests using python setup.py test -s tests.some_test. At
       the end of the tests you will see errors like:

          Traceback (most recent call last):
            File ".../site-packages/multiprocess/semaphore_tracker.py", line 132, in main
              cache.remove(name)
          KeyError: b'/mp-d3i13qd5'
          .../site-packages/multiprocess/semaphore_tracker.py:146: UserWarning: semaphore_tracker: There appear to be 58
                                                                   leaked semaphores to clean up at shutdown
            len(cache))
          .../site-packages/multiprocess/semaphore_tracker.py:158: UserWarning: semaphore_tracker: '/mp-f45dt4d6': [Errno 2]
                                                                   No such file or directory
            warnings.warn('semaphore_tracker: %r: %s' % (name, e))
          ...

       Your unittests will still succeed and run OK. Unfortunately, I’ve not found  a  remedy  to  this  problem
       using python setup.py test yet. What you can use instead is something like the following:

          python -m unittest tests.some_test

       This will work just fine. See the unittest documentation for more information.

   Shutting down takes a long time on error
       When you issue a KeyboardInterrupt or when an error occured in the function that’s run in parallel, there
       are  situations where MPIRE needs a few seconds to gracefully shutdown. This has to do with the fact that
       in these situations the task or results queue can be quite full, still. MPIRE drains these  queues  until
       they’re completely empty, as to properly shutdown and clean up every communication channel.

       To  remedy  this  issue  you can use the max_tasks_active parameter and set it to n_jobs * 2, or similar.
       Aside from the added benefit that the workers can start more quickly, the  queues  won’t  get  that  full
       anymore and shutting down will be much quicker. See Maximum number of active tasks for more information.
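
       For example, a minimal sketch of this remedy (the task function is illustrative):

          with WorkerPool(n_jobs=4) as pool:
              # Keep the task queue small so draining it on errors or interrupts stays fast
              results = pool.map(task, range(1_000_000), chunk_size=1, max_tasks_active=4 * 2)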

       When you’re using a lazy map function, also be sure to iterate through the results; otherwise the results
       queue will stay full and draining it will take longer.

   Unpicklable tasks/results
       Sometimes you can encounter deadlocks in your code when using MPIRE. When you encounter this, chances are
       some tasks or results from your script can’t be pickled. MPIRE makes use of  multiprocessing  queues  for
       inter-process communication and if your function returns unpicklable results the queue will unfortunately
       deadlock.

       The only way to remedy this problem in MPIRE would be to manually pickle objects before sending them to a
       queue and quit gracefully when encountering a pickle error. However, this would mean objects would always
       be pickled twice. This would add a heavy performance penalty and is therefore not an acceptable solution.

       Instead, the user should make sure their tasks and results are always  picklable  (which  in  most  cases
       won’t  be  a  problem),  or resort to setting use_dill=True. The latter is capable of pickling a lot more
       exotic types. See Dill for more information.
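
       For example, a minimal sketch using dill to parallelize a lambda, which plain pickle cannot serialize:

          from mpire import WorkerPool

          with WorkerPool(n_jobs=2, use_dill=True) as pool:
              results = pool.map(lambda x: x * 2, range(10))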

   AttributeError: Can’t get attribute ‘<some_function>’ on <module ‘__main__’ (built-in)>
       This error can occur when inside an iPython or Jupyter notebook session and the function  to  parallelize
       is  defined  in  that  session.  This  is often the result of using spawn as start method (the default on
       Windows), which starts a new process without copying the function in question.

       This error is actually related to the Unpicklable tasks/results problem and can be solved  in  a  similar
       way.  I.e.,  you can define your function in a file that can be imported by the child process, or you can
       resort to using dill by setting use_dill=True. See Dill for more information.

   Windows
       • When using dill and an exception occurs, or when the exception occurs in an exit function, it can print
         additional OSError messages in the terminal, but they can be safely ignored.

       • The mpire-dashboard script does not work on Windows.

   macOS
       • When encountering OSError: [Errno 24] Too many open files errors, use ulimit -n  <number>  to  increase
         the  limit  of  the  number  of  open  files. This is required because MPIRE uses file-descriptor based
         synchronization primitives and macOS has a very low default limit. For example, MPIRE  uses  about  190
         file descriptors when using 10 workers.

       • Pinning of processes to CPU cores is not supported on macOS. This is because macOS does not support the
         sched_setaffinity system call. A warning will be printed when trying to use this feature.

   API Reference
    Contents: WorkerPool, AsyncResult, Task chunking, Converting iterable of arguments, Dashboard, Other

   WorkerPool
       class mpire.WorkerPool(n_jobs=None, daemon=True, cpu_ids=None, shared_objects=None, pass_worker_id=False,
       use_worker_state=False, start_method='fork', keep_alive=False, use_dill=False, enable_insights=False,
       order_tasks=False)
              A  multiprocessing  worker pool which acts like a multiprocessing.Pool, but is faster and has more
              options.

              __enter__()
                     Enable the use of the with statement.

                     Return type
                            WorkerPool

              __exit__(*_)
                     Enable the use of the with statement. Gracefully terminates workers, if there are any

                     Return type
                            None

              __init__(n_jobs=None, daemon=True, cpu_ids=None, shared_objects=None, pass_worker_id=False,
              use_worker_state=False, start_method='fork', keep_alive=False, use_dill=False,
              enable_insights=False, order_tasks=False)

                      Parameters
                             • n_jobs (Optional[int]) – Number of workers to spawn. If None, will use
                               mpire.cpu_count()

                             • daemon (bool) – Whether to start the child processes as daemon

                            • cpu_ids  (List[Union[int,  List[int]]]) – List of CPU IDs to use for pinning child
                              processes to specific CPUs. The list must be as long as the number  of  jobs  used
                              (if  n_jobs  equals  None it must be equal to mpire.cpu_count()), or the list must
                              have exactly one element. In the former case, element i specifies the CPU ID(s) to
                              use for child process i. In the latter case the single element specifies  the  CPU
                              ID(s)  for  all  child  processes to use.  A single element can be either a single
                              integer specifying a single CPU ID, or a list of integers specifying that a single
                              child process can make use of multiple CPU IDs.  If  None,  CPU  pinning  will  be
                              disabled

                             • shared_objects (Any) – Objects to be passed on as shared objects to the workers
                               once. It will be passed on to the target, worker_init, and worker_exit functions.
                               shared_objects is only passed on when it’s not None. Shared objects will be
                               copy-on-write when using fork as start method. When enabled, functions receive the
                               shared objects as second argument, depending on other settings. The order is:
                               worker_id, shared_objects, worker_state, and finally the arguments passed on from
                               iterable_of_args

                             • pass_worker_id (bool) – Whether to pass on a worker ID to the target, worker_init,
                               and worker_exit functions. When enabled, functions receive the worker ID as first
                               argument, depending on other settings. The order is: worker_id, shared_objects,
                               worker_state, and finally the arguments passed on from iterable_of_args

                             • use_worker_state (bool) – Whether to let a worker have a worker state. The worker
                               state will be passed on to the target, worker_init, and worker_exit functions.
                               When enabled, functions receive the worker state as third argument, depending on
                               other settings. The order is: worker_id, shared_objects, worker_state, and
                               finally the arguments passed on from iterable_of_args

                             • start_method (str) – Which process start method to use. Options for
                               multiprocessing: 'fork' (default, if available), 'forkserver' and 'spawn'
                               (default, if 'fork' isn’t available). For multithreading use 'threading'. See
                               https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
                               for more information and
                               https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods
                               for some caveats when using the 'spawn' or 'forkserver' methods

                            • keep_alive (bool) – When True it will keep workers alive after  completing  a  map
                              call, allowing to reuse workers

                             • use_dill (bool) – Whether to use dill as serialization backend. Some exotic types
                               (e.g., lambdas, nested functions) don’t work well when using spawn as start
                               method. In such cases, use dill (can be a bit slower sometimes)

                             • enable_insights (bool) – Whether to enable worker insights. Might come at a small
                               performance penalty (often negligible)

                            • order_tasks (bool) – Whether to provide tasks to the workers in order,  such  that
                              worker 0 will get chunk 0, worker 1 will get chunk 1, etc.

              __weakref__
                     list of weak references to the object

              apply(func, args=(), kwargs=None, callback=None, error_callback=None, worker_init=None,
              worker_exit=None, task_timeout=None, worker_init_timeout=None, worker_exit_timeout=None)
                     Apply a function to a single task. This is a blocking call.

                      Parameters
                             • func (Callable) – Function to apply to the task. When passing on the worker ID the
                               function should receive the worker ID as its first argument. If shared objects are
                               provided the function should receive those as the next argument. If the worker
                               state has been enabled it should receive a state variable as the next argument

                             • args (Any) – Arguments to pass to a worker, which passes it to the function func
                               as func(*args)

                             • kwargs (Dict) – Keyword arguments to pass to a worker, which passes it to the
                               function func as func(**kwargs)

                             • callback (Optional[Callable]) – Callback function to call when the task is
                               finished. The callback function receives the output of the function func as its
                               argument

                            • error_callback (Optional[Callable]) – Callback function to call when the task  has
                              failed. The callback function receives the exception as its argument

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When  passing  on  the  worker ID the function should receive the worker ID as its
                              first argument. If shared objects are provided the function should  receive  those
                              as  the  next  argument.  If the worker state has been enabled it should receive a
                              state variable as the next argument

                            • worker_exit (Optional[Callable]) – Function to call  each  time  a  worker  exits.
                              Return     values     will    be    fetched    and    made    available    through
                              mpire.WorkerPool.get_exit_results. When passing on  the  worker  ID  the  function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the  function  should  receive those as the next argument. If the worker state has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout (Optional[float]) – Timeout in seconds for a single  task.  When  the
                              timeout  is  exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None to disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout  (Optional[float])  –  Timeout  in seconds for the worker_init
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • worker_exit_timeout  (Optional[float])  –  Timeout  in seconds for the worker_exit
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                     Return type
                            Any

                     Returns
                            Result of the function func applied to the task

              apply_async(func, args=(), kwargs=None, callback=None, error_callback=None, worker_init=None,
              worker_exit=None, task_timeout=None, worker_init_timeout=None, worker_exit_timeout=None)
                     Apply a function to a single task. This is a non-blocking call.

                      Parameters
                             • func (Callable) – Function to apply to the task. When passing on the worker ID the
                               function should receive the worker ID as its first argument. If shared objects are
                               provided the function should receive those as the next argument. If the worker
                               state has been enabled it should receive a state variable as the next argument

                             • args (Any) – Arguments to pass to a worker, which passes it to the function func
                               as func(*args)

                             • kwargs (Dict) – Keyword arguments to pass to a worker, which passes it to the
                               function func as func(**kwargs)

                             • callback (Optional[Callable]) – Callback function to call when the task is
                               finished. The callback function receives the output of the function func as its
                               argument

                            • error_callback (Optional[Callable]) – Callback function to call when the task  has
                              failed. The callback function receives the exception as its argument

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When  passing  on  the  worker ID the function should receive the worker ID as its
                              first argument. If shared objects are provided the function should  receive  those
                              as  the  next  argument.  If the worker state has been enabled it should receive a
                              state variable as the next argument

                            • worker_exit (Optional[Callable]) – Function to call  each  time  a  worker  exits.
                              Return     values     will    be    fetched    and    made    available    through
                              mpire.WorkerPool.get_exit_results. When passing on  the  worker  ID  the  function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the  function  should  receive those as the next argument. If the worker state has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout (Optional[float]) – Timeout in seconds for a single  task.  When  the
                              timeout  is  exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None to disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout  (Optional[float])  –  Timeout  in seconds for the worker_init
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • worker_exit_timeout  (Optional[float])  –  Timeout  in seconds for the worker_exit
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                     Return type
                            AsyncResult

                     Returns
                            Result of the function func applied to the task

              get_exit_results()
                     Obtain a list of exit results when an exit function is defined.

                     Return type
                            List

                     Returns
                            Exit results list

              get_insights()
                     Creates insights from the raw insight data

                     Return type
                            Dict

                     Returns
                            Dictionary containing worker insights

              imap(func, iterable_of_args, iterable_len=None, max_tasks_active=None, chunk_size=None,
              n_splits=None, worker_lifespan=None, progress_bar=False, worker_init=None, worker_exit=None,
              task_timeout=None, worker_init_timeout=None, worker_exit_timeout=None, progress_bar_options=None,
              progress_bar_style=None)
                     Same  as  multiprocessing.imap_unordered(),  but  ordered.  Also  allows  a user to set the
                     maximum number of tasks available in the queue.

                      Parameters
                             • func (Callable) – Function to call each time new task arguments become available.
                               When passing on the worker ID the function should receive the worker ID as its
                               first argument. If shared objects are provided the function should receive those
                               as the next argument. If the worker state has been enabled it should receive a
                               state variable as the next argument

                             • iterable_of_args (Union[Sized, Iterable]) – A numpy array or an iterable
                               containing tuples of arguments to pass to a worker, which passes it to the
                               function func

                             • iterable_len (Optional[int]) – Number of elements in the iterable_of_args. When
                               chunk_size is set to None it needs to know the number of tasks. This can either be
                               provided by implementing the __len__ function on the iterable object, or by
                               specifying the number of tasks

                             • max_tasks_active (Optional[int]) – Maximum number of active tasks in the queue. If
                               None it will be converted to n_jobs * chunk_size * 2

                             • chunk_size (Optional[int]) – Number of simultaneous tasks to give to a worker.
                               When None it will use n_splits.

                            • n_splits  (Optional[int])  – Number of splits to use when chunk_size is None. When
                              both chunk_size and n_splits are None, it will use n_splits = n_jobs * 64.

                            • worker_lifespan (Optional[int]) – Number of tasks a worker can handle before it is
                              restarted. If None, workers will stay alive the entire time. Use this when workers
                              use up too much memory over the course of time

                            • progress_bar (bool) – When True it will display a progress bar

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When passing on the worker ID the function should receive the  worker  ID  as  its
                              first  argument.  If shared objects are provided the function should receive those
                              as the next argument. If the worker state has been enabled  it  should  receive  a
                              state variable as the next argument

                            • worker_exit  (Optional[Callable])  –  Function  to  call each time a worker exits.
                              Return    values    will    be    fetched    and    made     available     through
                              mpire.WorkerPool.get_exit_results.  When  passing  on  the  worker ID the function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the function should receive those as the next argument. If the  worker  state  has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout  (Optional[float])  –  Timeout in seconds for a single task. When the
                              timeout is exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None  to  disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout (Optional[float]) – Timeout in  seconds  for  the  worker_init
                              function.  When the timeout is exceeded, MPIRE will raise a TimeoutError. Use None
                              to disable (default).

                            • worker_exit_timeout (Optional[float]) – Timeout in  seconds  for  the  worker_exit
                              function.  When the timeout is exceeded, MPIRE will raise a TimeoutError. Use None
                              to disable (default).

                            • progress_bar_options (Optional[Dict[str, Any]]) –  Dictionary  containing  keyword
                              arguments  to  pass  to  the  tqdm  progress bar. See tqdm.tqdm() for details. The
                              arguments total and leave will be overwritten by MPIRE.

                            • progress_bar_style (Optional[str]) – The progress bar style to use. Can be one  of
                              None, 'std', or 'notebook'

                     Return type
                            Generator[Any, None, None]

                     Returns
                            Generator yielding ordered results

              imap_unordered(func, iterable_of_args, iterable_len=None, max_tasks_active=None, chunk_size=None,
              n_splits=None, worker_lifespan=None, progress_bar=False, worker_init=None, worker_exit=None,
              task_timeout=None, worker_init_timeout=None, worker_exit_timeout=None, progress_bar_options=None,
              progress_bar_style=None)
                     Same  as  multiprocessing.imap_unordered(). Also allows a user to set the maximum number of
                     tasks available in the queue.

                      Parameters
                             • func (Callable) – Function to call each time new task arguments become available.
                               When passing on the worker ID the function should receive the worker ID as its
                               first argument. If shared objects are provided the function should receive those
                               as the next argument. If the worker state has been enabled it should receive a
                               state variable as the next argument

                             • iterable_of_args (Union[Sized, Iterable]) – A numpy array or an iterable
                               containing tuples of arguments to pass to a worker, which passes it to the
                               function func

                             • iterable_len (Optional[int]) – Number of elements in the iterable_of_args. When
                               chunk_size is set to None it needs to know the number of tasks. This can either be
                               provided by implementing the __len__ function on the iterable object, or by
                               specifying the number of tasks

                             • max_tasks_active (Optional[int]) – Maximum number of active tasks in the queue. If
                               None it will be converted to n_jobs * chunk_size * 2

                             • chunk_size (Optional[int]) – Number of simultaneous tasks to give to a worker.
                               When None it will use n_splits.

                            • n_splits  (Optional[int])  – Number of splits to use when chunk_size is None. When
                              both chunk_size and n_splits are None, it will use n_splits = n_jobs * 64.

                            • worker_lifespan (Optional[int]) – Number of tasks a worker can handle before it is
                              restarted. If None, workers will stay alive the entire time. Use this when workers
                              use up too much memory over the course of time

                            • progress_bar (bool) – When True it will display a progress bar

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When passing on the worker ID the function should receive the  worker  ID  as  its
                              first  argument.  If shared objects are provided the function should receive those
                              as the next argument. If the worker state has been enabled  it  should  receive  a
                              state variable as the next argument

                            • worker_exit  (Optional[Callable])  –  Function  to  call each time a worker exits.
                              Return    values    will    be    fetched    and    made     available     through
                              mpire.WorkerPool.get_exit_results.  When  passing  on  the  worker ID the function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the function should receive those as the next argument. If the  worker  state  has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout  (Optional[float])  –  Timeout in seconds for a single task. When the
                              timeout is exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None  to  disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout (Optional[float]) – Timeout in  seconds  for  the  worker_init
                              function.  When the timeout is exceeded, MPIRE will raise a TimeoutError. Use None
                              to disable (default).

                            • worker_exit_timeout (Optional[float]) – Timeout in  seconds  for  the  worker_exit
                              function.  When the timeout is exceeded, MPIRE will raise a TimeoutError. Use None
                              to disable (default).

                            • progress_bar_options (Optional[Dict[str, Any]]) –  Dictionary  containing  keyword
                              arguments  to  pass  to  the  tqdm  progress bar. See tqdm.tqdm() for details. The
                              arguments total and leave will be overwritten by MPIRE.

                            • progress_bar_style (Optional[str]) – The progress bar style to use. Can be one  of
                              None, 'std', or 'notebook'

                     Return type
                            Generator[Any, None, None]

                     Returns
                            Generator yielding unordered results

              join(keep_alive=False)
                     When  keep_alive=False:  inserts  a  poison  pill,  grabs the exit results, waits until the
                     tasks/results  queues  are  done,  and  waits  until  all  workers  are   finished.    When
                     keep_alive=True: inserts a non-lethal poison pill, and waits until the tasks/results queues
                     are done.

                      join and stop_and_join are aliases.

                     Parameters
                            keep_alive (bool) – Whether to keep the workers alive

                     Return type
                            None

              map(func, iterable_of_args, iterable_len=None, max_tasks_active=None, chunk_size=None,
              n_splits=None, worker_lifespan=None, progress_bar=False, concatenate_numpy_output=True,
              worker_init=None, worker_exit=None, task_timeout=None, worker_init_timeout=None,
              worker_exit_timeout=None, progress_bar_options=None, progress_bar_style=None)
                     Same  as  multiprocessing.map().  Also  allows  a  user  to set the maximum number of tasks
                     available in the queue.  Note that this function can be slower than the unordered version.

                      Parameters
                             • func (Callable) – Function to call each time new task arguments become available.
                               When passing on the worker ID the function should receive the worker ID as its
                               first argument. If shared objects are provided the function should receive those
                               as the next argument. If the worker state has been enabled it should receive a
                               state variable as the next argument

                             • iterable_of_args (Union[Sized, Iterable]) – A numpy array or an iterable
                               containing tuples of arguments to pass to a worker, which passes it to the
                               function func

                             • iterable_len (Optional[int]) – Number of elements in the iterable_of_args. When
                               chunk_size is set to None it needs to know the number of tasks. This can either be
                               provided by implementing the __len__ function on the iterable object, or by
                               specifying the number of tasks

                             • max_tasks_active (Optional[int]) – Maximum number of active tasks in the queue. If
                               None it will be converted to n_jobs * chunk_size * 2

                             • chunk_size (Optional[int]) – Number of simultaneous tasks to give to a worker.
                               When None it will use n_splits.

                            • n_splits  (Optional[int])  – Number of splits to use when chunk_size is None. When
                              both chunk_size and n_splits are None, it will use n_splits = n_jobs * 64.

                            • worker_lifespan (Optional[int]) – Number of tasks a worker can handle before it is
                              restarted. If None, workers will stay alive the entire time. Use this when workers
                              use up too much memory over the course of time

                            • progress_bar (bool) – When True it will display a progress bar

                            • concatenate_numpy_output (bool) – When True it will concatenate numpy output to  a
                              single numpy array

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When  passing  on  the  worker ID the function should receive the worker ID as its
                              first argument. If shared objects are provided the function should  receive  those
                              as  the  next  argument.  If the worker state has been enabled it should receive a
                              state variable as the next argument

                            • worker_exit (Optional[Callable]) – Function to call  each  time  a  worker  exits.
                              Return     values     will    be    fetched    and    made    available    through
                              mpire.WorkerPool.get_exit_results. When passing on  the  worker  ID  the  function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the  function  should  receive those as the next argument. If the worker state has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout (Optional[float]) – Timeout in seconds for a single  task.  When  the
                              timeout  is  exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None to disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout  (Optional[float])  –  Timeout  in seconds for the worker_init
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • worker_exit_timeout  (Optional[float])  –  Timeout  in seconds for the worker_exit
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • progress_bar_options  (Optional[Dict[str,  Any]])  – Dictionary containing keyword
                              arguments to pass to the tqdm progress  bar.  See  tqdm.tqdm()  for  details.  The
                              arguments total and leave will be overwritten by MPIRE.

                            • progress_bar_style  (Optional[str]) – The progress bar style to use. Can be one of
                              None, 'std', or 'notebook'

                     Return type
                            Any

                     Returns
                            List with ordered results
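
                      A minimal usage sketch (the square function and the worker count are illustrative):

                         from mpire import WorkerPool

                         def square(x):
                             return x * x

                         with WorkerPool(n_jobs=4) as pool:
                             # Results come back in the same order as the input; a progress bar is shown
                             results = pool.map(square, range(100), progress_bar=True, chunk_size=10)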

              map_unordered(func, iterable_of_args, iterable_len=None, max_tasks_active=None, chunk_size=None,
              n_splits=None, worker_lifespan=None, progress_bar=False, worker_init=None, worker_exit=None,
              task_timeout=None, worker_init_timeout=None, worker_exit_timeout=None, progress_bar_options=None,
              progress_bar_style=None)
                     Same as multiprocessing.map(), but unordered. Also allows a user to set the maximum  number
                     of tasks available in the queue.

                      Parameters
                             • func (Callable) – Function to call each time new task arguments become
                               available. When passing on the worker ID the function should receive the worker
                               ID as its first argument. If shared objects are provided the function should
                               receive those as the next argument. If the worker state has been enabled it
                               should receive a state variable as the next argument

                             • iterable_of_args (Union[Sized, Iterable]) – A numpy array or an iterable
                               containing tuples of arguments to pass to a worker, which passes it to the
                               function func

                             • iterable_len (Optional[int]) – Number of elements in the iterable_of_args. When
                               chunk_size is set to None it needs to know the number of tasks. This can either
                               be provided by implementing the __len__ function on the iterable object, or by
                               specifying the number of tasks

                             • max_tasks_active (Optional[int]) – Maximum number of active tasks in the queue.
                               If None it will be converted to n_jobs * chunk_size * 2

                             • chunk_size (Optional[int]) – Number of simultaneous tasks to give to a worker.
                               When None it will use n_splits.

                            • n_splits (Optional[int]) – Number of splits to use when chunk_size is  None.  When
                              both chunk_size and n_splits are None, it will use n_splits = n_jobs * 64.

                            • worker_lifespan (Optional[int]) – Number of tasks a worker can handle before it is
                              restarted. If None, workers will stay alive the entire time. Use this when workers
                              use up too much memory over the course of time

                            • progress_bar (bool) – When True it will display a progress bar

                            • worker_init (Optional[Callable]) – Function to call each time a new worker starts.
                              When  passing  on  the  worker ID the function should receive the worker ID as its
                              first argument. If shared objects are provided the function should  receive  those
                              as  the  next  argument.  If the worker state has been enabled it should receive a
                              state variable as the next argument

                            • worker_exit (Optional[Callable]) – Function to call  each  time  a  worker  exits.
                              Return     values     will    be    fetched    and    made    available    through
                              mpire.WorkerPool.get_exit_results. When passing on  the  worker  ID  the  function
                              should receive the worker ID as its first argument. If shared objects are provided
                              the  function  should  receive those as the next argument. If the worker state has
                              been enabled it should receive a state variable as the next argument

                            • task_timeout (Optional[float]) – Timeout in seconds for a single  task.  When  the
                              timeout  is  exceeded,  MPIRE  will  raise  a  TimeoutError.  Use  None to disable
                              (default).  Note:  the  timeout  doesn’t  apply  to  worker_init  and  worker_exit
                              functions, use worker_init_timeout and worker_exit_timeout for that, respectively

                            • worker_init_timeout  (Optional[float])  –  Timeout  in seconds for the worker_init
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • worker_exit_timeout  (Optional[float])  –  Timeout  in seconds for the worker_exit
                              function. When the timeout is exceeded, MPIRE will raise a TimeoutError. Use  None
                              to disable (default).

                            • progress_bar_options  (Optional[Dict[str,  Any]])  – Dictionary containing keyword
                              arguments to pass to the tqdm progress  bar.  See  tqdm.tqdm()  for  details.  The
                              arguments total and leave will be overwritten by MPIRE.

                            • progress_bar_style  (Optional[str]) – The progress bar style to use. Can be one of
                              None, 'std', or 'notebook'

                     Return type
                            Any

                     Returns
                            List with unordered results
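
                      A minimal sketch; compared to map(), results are returned in completion order rather than
                      input order (the square function is illustrative):

                         from mpire import WorkerPool

                         def square(x):
                             return x * x

                         with WorkerPool(n_jobs=4) as pool:
                             # Results arrive in completion order, which is generally faster when
                             # individual tasks have varying run times
                             results = pool.map_unordered(square, range(100))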

              pass_on_worker_id(pass_on=True)
                      Set whether to pass on the worker ID to the function to be executed or not
                      (default=False).

                     Parameters
                            pass_on  (bool)  –  Whether  to  pass on a worker ID to the target, worker_init, and
                            worker_exit functions. When enabled, functions receive the worker  ID  depending  on
                            other  settings.  The order is: worker_id, shared_objects, worker_state, and finally
                            the arguments passed on using iterable_of_args

                     Return type
                            None
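
                      A minimal sketch; the task function below is illustrative:

                         from mpire import WorkerPool

                         def task(worker_id, x):
                             # worker_id comes first, followed by the task arguments
                             return worker_id, x * x

                         with WorkerPool(n_jobs=4) as pool:
                             pool.pass_on_worker_id()
                             results = pool.map(task, range(10))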

              print_insights()
                     Prints insights per worker

                     Return type
                            None

              set_keep_alive(keep_alive=True)
                     Set whether workers should be kept alive in between consecutive map calls.

                     Parameters
                            keep_alive (bool) – When True it will keep workers  alive  after  completing  a  map
                            call, allowing to reuse workers

                     Return type
                            None
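
                      A minimal sketch; with keep_alive enabled, the second map call below reuses the workers
                      started by the first one:

                         from mpire import WorkerPool

                         def square(x):
                             return x * x

                         with WorkerPool(n_jobs=4) as pool:
                             pool.set_keep_alive()
                             pool.map(square, range(100))
                             # The workers started for the first call are reused here
                             pool.map(square, range(100))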

              set_order_tasks(order_tasks=True)
                     Set  whether to provide tasks to the workers in order, such that worker 0 will get chunk 0,
                     worker 1 will get chunk 1, etc.

                     Parameters
                            order_tasks (bool) – Whether to provide tasks to the workers  in  order,  such  that
                            worker 0 will get chunk 0, worker 1 will get chunk 1, etc.

                     Return type
                            None

              set_shared_objects(shared_objects=None)
                     Set shared objects to pass to the workers.

                     Parameters
                            shared_objects  (Any)  –  Objects  to  be passed on as shared objects to the workers
                            once. It will be passed on to the target, worker_init,  and  worker_exit  functions.
                            shared_objects  is  only  passed  on  when  it’s  not  None.  Shared objects will be
                            copy-on-write when using fork as start method. When enabled, functions  receive  the
                            shared objects depending on other settings. The order is: worker_id, shared_objects,
                             worker_state, and finally the arguments passed on using iterable_of_args

                     Return type
                            None
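
                      A minimal sketch; the read-only lookup list is illustrative (copy-on-write only applies
                      when using the fork start method):

                         from mpire import WorkerPool

                         def task(shared_objects, idx):
                             # Shared objects are passed before the task arguments
                             lookup = shared_objects
                             return lookup[idx]

                         with WorkerPool(n_jobs=4) as pool:
                             pool.set_shared_objects(list(range(1000)))
                             results = pool.map(task, range(10))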

              set_use_worker_state(use_worker_state=True)
                     Set  whether or not each worker should have its own state variable. Each worker has its own
                     state, so it’s not shared between the workers.

                     Parameters
                            use_worker_state (bool) – Whether to let a worker have a worker  state.  The  worker
                            state  will be passed on to the target, worker_init, and worker_exit functions. When
                            enabled, functions receive the worker state depending on other settings.  The  order
                            is:  worker_id,   shared_objects,  worker_state, and finally the arguments passed on
                            using iterable_of_args

                     Return type
                            None
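
                      A minimal sketch combining worker state with a worker_init function; the 'offset' key is
                      illustrative:

                         from mpire import WorkerPool

                         def init(worker_state):
                             # Runs once per worker; use it to set up expensive state
                             worker_state['offset'] = 100

                         def task(worker_state, x):
                             return worker_state['offset'] + x

                         with WorkerPool(n_jobs=4) as pool:
                             pool.set_use_worker_state()
                             results = pool.map(task, range(10), worker_init=init)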

              stop_and_join(keep_alive=False)
                     When keep_alive=False: inserts a poison pill, grabs  the  exit  results,  waits  until  the
                     tasks/results   queues   are  done,  and  waits  until  all  workers  are  finished.   When
                     keep_alive=True: inserts a non-lethal poison pill, and waits until the tasks/results queues
                     are done.

                      join and stop_and_join are aliases.

                     Parameters
                            keep_alive (bool) – Whether to keep the workers alive

                     Return type
                            None

              terminate()
                      Tries to do a graceful shutdown of the workers by interrupting them. In case processes
                      deadlock, it will send a SIGKILL.

                     Return type
                            None

   AsyncResult
       class mpire.async_result.AsyncResult(cache, callback, error_callback, job_id=None,
       delete_from_cache=True, timeout=None)
              Adapted from multiprocessing.pool.ApplyResult.
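
               AsyncResult objects are typically obtained from mpire.WorkerPool.apply_async(). A minimal usage
               sketch (the square function is illustrative):

                  from mpire import WorkerPool

                  def square(x):
                      return x * x

                  with WorkerPool(n_jobs=4) as pool:
                      async_results = [pool.apply_async(square, (i,)) for i in range(10)]
                      # get() blocks until each task is finished and re-raises any exception
                      results = [r.get() for r in async_results]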

              __init__(cache, callback, error_callback, job_id=None, delete_from_cache=True, timeout=None)

                      Parameters
                             • cache (Dict) – Cache for storing intermediate results

                            • callback (Optional[Callable]) –  Callback  function  to  call  when  the  task  is
                              finished.  The  callback  function  receives  the  output  of  the function as its
                              argument

                            • error_callback (Optional[Callable]) – Callback function to call when the task  has
                              failed. The callback function receives the exception as its argument

                            • job_id (Optional[int]) – Job ID of the task. If None, a new job ID is generated

                            • delete_from_cache  (bool) – If True, the result is deleted from the cache when the
                              task is finished

                            • timeout (Optional[float]) – Timeout in seconds for a single task. When the timeout
                              is exceeded, MPIRE will raise a TimeoutError. Use None to disable (default)

              __weakref__
                     list of weak references to the object

              get(timeout=None)
                     Wait until the task is finished and return the output of the function

                     Parameters
                            timeout (Optional[float]) – Timeout in seconds. If None, wait indefinitely

                     Return type
                            Any

                     Returns
                            Output of the function

                     Raises TimeoutError if the task is not finished within  the  timeout.  When  the  task  has
                            failed, the exception raised by the function is re-raised

              ready()

                     Return type
                            bool

                     Returns
                            Returns True if the task is finished

              successful()

                     Return type
                            bool

                     Returns
                            Returns True if the task has finished successfully

                     Raises ValueError if the task is not finished yet

              wait(timeout=None)
                     Wait until the task is finished

                     Parameters
                            timeout (Optional[float]) – Timeout in seconds. If None, wait indefinitely

                     Return type
                            None

   Task chunking
       mpire.utils.chunk_tasks(iterable_of_args, iterable_len=None, chunk_size=None, n_splits=None)
              Chunks  tasks  such  that  individual  workers will receive chunks of tasks rather than individual
              ones, which can speed up processing drastically.

               Parameters
                      • iterable_of_args (Iterable) – A numpy array or an iterable containing tuples of
                        arguments to pass to a worker, which passes it to the function

                     • iterable_len (Optional[int]) – Number of tasks available in iterable_of_args. Only needed
                       when iterable_of_args is a generator

                     • chunk_size (Union[int, float, None]) – Number of simultaneous tasks to give to a  worker.
                       If None, will use n_splits to determine the chunk size

                     • n_splits (Optional[int]) – Number of splits to use when chunk_size is None

              Return type
                     Generator[Collection, None, None]

              Returns
                     Generator of chunked task arguments
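
               A minimal sketch; the chunk boundaries shown in the comment are illustrative:

                  from mpire.utils import chunk_tasks

                  # Split 10 tasks into chunks of at most 3 tasks each
                  chunks = list(chunk_tasks(range(10), chunk_size=3))
                  # e.g. [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9,)]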

   Converting iterable of arguments
       mpire.utils.make_single_arguments(iterable_of_args, generator=True)
              Converts an iterable of single arguments to an iterable of single argument tuples

               Parameters
                      • iterable_of_args (Iterable) – A numpy array or an iterable containing tuples of
                        arguments to pass to a worker, which passes it to the function

                     • generator  (bool)  –  Whether or not to return a generator, otherwise a materialized list
                       will be returned

              Return type
                     Union[List, Generator]

              Returns
                     Iterable of single argument tuples
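
               A minimal sketch; without wrapping, a string argument would be unpacked into individual
               characters by the worker:

                  from mpire.utils import make_single_arguments

                  # Wrap each element in a 1-tuple so it is passed as a single argument
                  args = make_single_arguments(['hello', 'world'], generator=False)
                  # [('hello',), ('world',)]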

   Dashboard
       mpire.dashboard.start_dashboard(port_range=range(8080, 8100))
              Starts a new MPIRE dashboard

              Parameters
                     port_range (Sequence) – Port range to try.

              Return type
                     Dict[str, Union[str, int]]

              Returns
                     A dictionary containing the dashboard port number and manager host and  port  number  being
                     used

       mpire.dashboard.connect_to_dashboard(manager_port_nr, manager_host=None)
              Connects to an existing MPIRE dashboard

               Parameters
                      • manager_port_nr (int) – Port to use when connecting to a manager

                     • manager_host  (Union[bytes,  str,  None])  – Host to use when connecting to a manager. If
                       None it will use localhost

              Return type
                     None
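
               A minimal sketch for starting a dashboard and connecting to it from another process; the
               dictionary keys and port numbers shown in the comments are illustrative:

                  from mpire.dashboard import connect_to_dashboard, start_dashboard

                  # Process A: start the dashboard and note the connection details it returns
                  dashboard_details = start_dashboard()
                  print(dashboard_details)
                  # e.g. {'dashboard_port_nr': 8080, 'manager_host': 'localhost', 'manager_port_nr': 8081}

                  # Process B: connect using the manager host and port printed above
                  connect_to_dashboard(manager_port_nr=8081, manager_host='localhost')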

       mpire.dashboard.shutdown_dashboard()
              Shuts down the dashboard

              Return type
                     None

       mpire.dashboard.get_stacklevel()
              Gets the stack level to use when obtaining function details (used for the dashboard)

              Return type
                     int

              Returns
                     Stack level

       mpire.dashboard.set_stacklevel(stacklevel)
              Sets the stack level to use when obtaining function details (used for the dashboard)

              Parameters
                     stacklevel (int) – Stack level

              Return type
                     None

   Other
       mpire.cpu_count()
              Returns the number of CPUs in the system

   Contribution guidelines
       If you want to contribute to MPIRE, great! Please follow the steps below to ensure a smooth process:

       1. Clone the project.

        2. Create a new branch for your feature or bug fix. Give your branch a meaningful name.

       3. Make your feature addition or bug fix.

        4. Add tests for it and test it yourself. Make sure it works on both Unix- and Windows-based systems, or
           document why it doesn’t work on one of the platforms.

       5. Add documentation for it. Don’t forget about the changelog:

          • Reference  the  issue  number from GitHub in the changelog, if applicable (see current changelog for
            examples).

          • Don’t mention a date or a version number here, but use Unreleased instead.

       6. Commit with a meaningful commit message (e.g. the changelog).

       7. Open a pull request.

       8. Resolve any issues or comments by the reviewer.

       9. Merge PR by squashing all your individual commits.

   Making a release
       A release is only made by the project maintainer. The following steps are required:

       1. Update the changelog with the release date and version number. Version  numbers  follow  the  Semantic
          Versioning guidelines

       2. Update the version number in setup.py and docs/conf.py.

       3. Commit and push the changes.

       4. Make sure the tests pass on GitHub Actions.

       5. Create a tag for the release by using git tag -a vX.Y.Z -m "vX.Y.Z".

       6. Push the tag to GitHub by using git push origin vX.Y.Z.

   Changelog
   2.10.2
       (2024-05-07)

       • Function details in progress_bar.py are only obtained when the dashboard is running (#128)

        • Obtaining the user name is now put in a try-except block to prevent MPIRE from crashing when the user
          name cannot be obtained, which can happen when running in a container as a non-root user (#128)

   2.10.1
       (2024-03-19)

       • Fixed a bug in the timeout handler where the cache dictionary could be changed during iteration (#123)

       • Fixed an authentication error when using a progress bar or insights in a spawn  or  forkserver  context
         when using dill (#124)

   2.10.0
       (2024-02-19)

       • Added support for macOS (#27, #79, #91)

         • Fixes memory leaks on macOS

         • Reduced the amount of semaphores used

         • Issues a warning when cpu_ids is used on macOS

       • Added  mpire.dashboard.set_stacklevel()  to  set the stack level in the dashboard. This influences what
         line to display in the ‘Invoked on line’ section. (#118)

       • Use function details from the __call__ method on the dashboard in case the callable being executed is a
         class instance (#117)

       • Use (global) average rate for the estimate on the dashboard when smoothing=0 (#117)

       • Make it possible to reuse the same progress_bar_options without raising warnings (#117)

       • Removed    deprecated    progress_bar_position    parameter    from    the    map    functions.     Use
         progress_bar_options[‘position’] instead (added since v2.6.0)

   2.9.0
       (2024-01-08)

       • Added support for the rich progress bar style (#96)

       • Added the option to only show progress on the dashboard. (#107)

       • Progress bars are now supported on Windows when using threading as start method.

       • Insights now also work when using the forkserver and spawn start methods. (#104)

       • When using insights on Windows the arguments of the top 5 longest tasks are now available as well.

       • Fixed deprecated escape import from flask by importing directly from markupsafe. (#106)

       • Fixed mpire.dashboard.start_dashboard() freeze when there are no two ports available. (#112)

       • Added mpire.dashboard.shutdown_dashboard() to shutdown the dashboard.

       • Added py.typed file to prompt mypy for type checking. (#108)

   2.8.1
       (2023-11-08)

       • Excluded the tests folder from MPIRE distributions (#89)

       • Added  a workaround for semaphore leakage on macOS and fixed a bug when working in a fork context while
         the system default is spawn (#92)

       • Fix progressbar percentage on dashboard (#101)

       • Fixed a bug where starting multiple apply_async tasks with a task timeout didn’t  interrupt  all  tasks
         when the timeout was reached (#98)

        • Added Python 3.12 to the test workflow and dropped Python 3.6 and 3.7 (#102)

   2.8.0
       (2023-08-16)

       • Added support for Python 3.11 (#67)

   2.7.1
       (2023-04-14)

        • Transferred ownership of the project from Slimmer AI to sybrenjansen

   2.7.0
       (2023-03-17)

       • Added the mpire.WorkerPool.apply() and mpire.WorkerPool.apply_async() functions (#63)

       • When  inside  a  Jupyter  notebook, the progress bar will not automatically switch to a widget anymore.
         tqdm cannot always determine with certainty that someone is in a notebook or, e.g., a Jupyter  console.
         Another  reason  is to avoid the many errors people get when having widgets or javascript disabled. See
         Progress bar style for changing the progress bar to a widget (#71)

        • The mpire.dashboard.connect_to_dashboard() function now raises a ConnectionRefusedError when the
          dashboard isn’t running, instead of silently failing and deadlocking the next map call with a progress
          bar (#68)

       • Added support for a progress bar without knowing the size of the  iterable.  It  used  to  disable  the
         progress bar when the size was unknown

       • Changed how max_tasks_active is handled. It now applies to the number of tasks that are currently being
         processed,  instead  of  the  number of chunks of tasks, as you would expect from the name. Previously,
         when the chunk size was set to anything other than 1, the number of active tasks could be  higher  than
         max_tasks_active

       • Updated some exception messages and docs (#69)

       • Changed how worker results, restarts, timeouts, unexpected deaths, and exceptions are handled. They are
         now handled by individual threads such that the main thread is more responsive. The API is the same, so
         no user changes are needed

       • Mixing multiple map calls now raises an error (see Mixing map functions)

       • Fixed a bug where calling a map function with a progress bar multiple times in a row didn’t display the
         progress bar correctly

       • Fixed a bug where the dashboard didn’t show an error when an exit function raised an exception

   2.6.0
       (2022-08-29)

       • Added Python 3.10 support

       • The  tqdm  progress  bar  can  now  be  customized  using the progress_bar_options parameter in the map
         functions (#57)

       • Using progress_bar_position from a map function is now deprecated and will be removed in MPIRE v2.10.0.
         Use progress_bar_options['position'] instead

       • Deprecated enable_insights from a map function,  use  enable_insights  in  the  WorkerPool  constructor
         instead

       • Fixed  a  bug where a worker could exit before an exception was entirely sent over the queue, causing a
         deadlock (#56)

       • Fixed a bug where exceptions with init arguments weren’t handled correctly (#58)

       • Fixed a rare and weird bug in Windows that could cause a deadlock (probably fixes #55)

   2.5.0
       (2022-07-25)

       • Added the option to fix the order of tasks given to the workers (#46)

       • Fixed a bug where updated WorkerPool parameters aren’t used in subsequent map calls when keep_alive  is
         enabled

   2.4.0
       (2022-05-25)

       • A  timeout for the target, worker_init, and worker_exit functions can be specified after which a worker
         is stopped (#36)

       • A WorkerPool can now be started within a thread which isn’t the main thread (#44)

   2.3.5
       (2022-04-25)

       • MPIRE now handles defunct child processes properly, instead of deadlocking (#34)

       • Added benchmark highlights to README (#38)

   2.3.4
       (2022-03-29)

       • Platform specific dependencies are now handled using environment markers as defined in PEP-508 (#30)

       • Fixes hanging WorkerPool when using worker_lifespan and returning results that exceed the pipe capacity
         (#32)

        • Fixed insights unit tests that could sometimes fail because they ran too fast

   2.3.3
       (2021-11-29)

       • Changed progress bar handler process to thread, making it more stable (especially in notebooks)

       • Changed progress bar tasks completed queue to array, to make it more responsive and faster

       • Disabled the tqdm monitor thread which, in combination with MPIRE’s own  tqdm  lock,  could  result  in
         deadlocks

   2.3.2
       (2021-11-19)

       • Included license file in source distribution (#25)

   2.3.1
       (2021-11-16)

       • Made connecting to the tqdm manager more robust (#23)

   2.3.0
       (2021-10-15)

       • Fixed progress bar in a particular setting with iPython and django installed (#13)

       • keep_alive  now  works  even  when  the  function to be called or any other parameter passed to the map
         function is changed (#15)

       • Moved enable_insights to the WorkerPool constructor. Using enable_insights from a map function  is  now
         deprecated and will be removed in MPIRE v2.6.0.

       • Restructured docs and updated several sections for Windows users.

   2.2.1
       (2021-08-31)

       • Fixed compatibility with newer tqdm versions (>= 4.62.2) (#11)

   2.2.0
       (2021-08-30)

       • Added support for Windows (#6, #7). Support has a few caveats:

         • When using worker insights the arguments of the top 5 longest tasks are not available

         • Progress bar is not supported when using threading as start method

         • When  using  dill  and  an exception occurs, or when the exception occurs in an exit function, it can
           print additional OSError messages in the terminal, but these can be safely ignored.

   2.1.1
       (2021-08-26)

       • Fixed a bug with newer versions of tqdm. The progress bar would throw an AttributeError when  connected
         to a dashboard.

       • README and documentation updated

   2.1.0
       (2021-08-06)

       • Workers now have their own task queue, which speeds up tasks with bigger payloads

       • Fixed progress bar showing error information when completed without error

       • Fixed progress bar and worker insights not displaying properly when using threading

        • Progress bar handling improved across several scenarios

       • Dashboard can now handle progress bars when using spawn or forkserver as start method

       • Added closing of multiprocessing.JoinableQueue objects, to clean up intermediate junk

       • Removed numpy dependency

       • Made dill optional again. In many cases it slows processing down

   2.0.0
       (2021-07-07)

        • Worker insights added, providing users insight into multiprocessing efficiency

       • worker_init and worker_exit parameters added to each map function

       • max_active_tasks is now set to n_jobs * 2 when max_active_tasks=None, to speed up most jobs

       • n_splits is now set to n_jobs * 64 when both chunk_size and n_splits are None

       • Dashboard ports can now be configured

       • Renamed func_pointer to func in each map function

       • Fixed a bug with the threading backend not terminating correctly

       • Fixed a bug with the progress bar not showing correctly in notebooks

       • Using multiprocess is now the default

       • Added some debug logging

       • Refactored a lot of code

       • Minor bug fixes, which should make things more stable.

       • Removed Python 3.5 support

       • Removed   add_task,   get_result,   insert_poison_pill,   stop_workers,   and   join   functions   from
         mpire.WorkerPool. Made start_workers private.  There wasn’t any reason to use these functions.

   1.2.2
       (2021-04-23)

       • Updated documentation CSS which fixes bullet lists not showing properly

   1.2.1
       (2021-04-22)

       • Updated some unittests and fixed some linting issues

       • Minor improvements in documentation

   1.2.0
       (2021-04-22)

       • Workers can be kept alive in between consecutive map calls

       • Setting CPU affinity is no longer restricted to Linux platforms

       • README updated to use RST format for better compatibility with PyPI

       • Added classifiers to the setup file

   1.1.3
       (2020-09-03)

       • First public release on Github and PyPi

   1.1.2
       (2020-08-27)

       • Added missing typing information

       • Updated some docstrings

       • Added license

   1.1.1
       (2020-02-19)

       • Changed collections.Iterable to collections.abc.Iterable due to deprecation of the former

   1.1.0
       (2019-10-31)

       • Removed custom progress bar support to fix Jupyter notebook support

       • New progress_bar_position parameter is now available to set the position of the progress bar when using
         nested worker pools

       • Screen resizing is now supported when using a progress bar

   1.0.0
       (2019-10-29)

       • Added the MPIRE dashboard

       • Added threading as a possible backend

       • Progress bar handling now occurs in a separate process, instead of a thread, to improve responsiveness

       • Refactoring of code and small bug fixes in error handling

       • Removed deprecated functionality

   0.9.0
       (2019-03-11)

       • Added support for using different start methods (‘spawn’ and ‘forkserver’) instead of only the  default
         method ‘fork’

       • Added optional support for using dill in multiprocessing by utilizing the multiprocess library

       • The mpire.Worker class is no longer directly available

   0.8.1
       (2019-02-06)

        • Fixed a bug where the process would hang when the progress bar was set to True and an empty iterable
          was provided

   0.8.0
       (2018-11-01)

       • Added support for worker state

       • Chunking numpy arrays is now done using numpy slicing

       • mpire.WorkerPool.map() now supports automatic concatenation of numpy array output

   0.7.2
       (2018-06-14)

       • Small bug fix when not passing on a boolean or tqdm object for the progress_bar parameter

   0.7.1
       (2017-12-20)

       • You  can  now  pass  on  a  dictionary  as  an  argument  which  will be unpacked accordingly using the
         **-operator.

       • New function mpire.utils.make_single_arguments() added which allows you to create an iterable of single
         argument tuples out of an iterable of single arguments

   0.7.0
        (2017-12-11)

        • mpire.utils.chunk_tasks() is now available as a public function

        • Chunking in the above function and the map functions now accepts an n_splits parameter

       • iterable_of_args in map functions can now contain single values instead of only iterables

       • tqdm is now available from the MPIRE  package  which  automatically  switches  to  the  Jupyter/IPython
         notebook widget when available

       • Small bugfix in cleaning up a worker pool when no map function was called

   0.6.2
       (2017-11-07)

       • Fixed a second bug where the main process could get unresponsive when an exception was raised

   0.6.1
       (2017-11-06)

        • Fixed a bug where exceptions would sometimes fail to pickle

       • Fixed a bug where the main process could get unresponsive when an exception was raised

       • Child processes are now cleaned up in parallel when an exception was raised

   0.6.0
        (2017-11-03)

        • The restart_workers parameter is now deprecated and will be removed in v1.0.0

       • Progress bar functionality added (using tqdm)

       • Improved error handling in user provided functions

       • Fixed randomly occurring BrokenPipeErrors and deadlocks

   0.5.1
       (2017-10-12)

       • Child  processes  can now also be pinned to a range of CPUs, instead of only a single one. You can also
         specify a single CPU or range of CPUs that have to be shared between all child processes

   0.5.0
       (2017-10-06)

       • Added CPU pinning.

       • Default number of processes to spawn when using n_jobs=None is now set to the number of CPUs available,
         instead of cpu_count() - 1

   0.4.0
       (2017-10-05)

        • Workers can now be started as normal child processes (non-daemon) such that nested mpire.WorkerPool
          instances are possible

   0.3.0
       (2017-09-15)

        • The worker ID can now be passed on to the function to be executed by using the
          mpire.WorkerPool.pass_on_worker_id() function

       • Removed       the       use       of       has_return_value_with_shared_objects       when        using
         mpire.WorkerPool.set_shared_objects().  MPIRE now handles both cases out of the box

   0.2.0
       (2017-06-27)

       • Added docs

   0.1.0
       First release

AUTHOR

       Sybren Jansen

COPYRIGHT

       2025, Sybren Jansen

2.10.2                                            Apr 14, 2025                                          MPIRE(1)