Provided by: datalad_0.18.1-2_all

NAME

       datalad create-sibling-ria - creates a sibling to a dataset in a RIA store

SYNOPSIS

       datalad create-sibling-ria [-h] -s NAME [-d DATASET] [--storage-name NAME] [--alias ALIAS]
              [--post-update-hook]  [--shared  {false|true|umask|group|all|world|everybody|0xxx}]
              [--group   GROUP]   [--storage-sibling  MODE]  [--existing  MODE]  [--new-store-ok]
              [--trust-level TRUST-LEVEL] [-r]  [-R  LEVELS]  [--no-storage-sibling]  [--push-url
              ria+<ssh|file>://<host>[/path]] [--version] ria+<ssh|file|https>://<host>[/path]

DESCRIPTION

       Communication with a dataset in a RIA store is implemented via two siblings: a regular Git
       remote (repository sibling) and a git-annex special remote for data transfer (storage
       sibling), with the former having a publication dependency on the latter. By default, the
       name of the storage sibling is derived from the repository sibling's name by appending
       "-storage".
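
       The default naming rule can be sketched in a few lines of Python. The function name
       and its parameters are hypothetical and only illustrate the rule stated above; this is
       not DataLad's own implementation.

```python
from typing import Optional


def storage_sibling_name(sibling_name: str, storage_name: Optional[str] = None) -> str:
    """Return the name of the storage sibling (hypothetical helper).

    If no explicit storage name is given, "-storage" is appended to the
    repository sibling's name, as described in the text.
    """
    if storage_name is None:
        return f"{sibling_name}-storage"
    if storage_name == sibling_name:
        # The two siblings need distinct names.
        raise ValueError("storage sibling name must differ from the sibling name")
    return storage_name
```

       For a repository sibling named "ria-backup", this yields "ria-backup-storage".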

       The store's base path must either not exist, be an empty directory, or be a valid RIA
       store.

       RIA URL format
       ~~~~~~~~~~~~~~

       Interactions with new or existing RIA stores require RIA URLs to  identify  the  store  or
       specific datasets inside of it.

       The general structure of a RIA URL pointing to a store takes the form
       ria+ssh://[user@]hostname:/absolute/path/to/ria-store or
       ria+file:///absolute/path/to/ria-store.

       The general structure of a RIA URL pointing to a dataset in a store (for example for
       cloning) takes a similar form, but appends either the dataset's UUID or a ~ symbol followed
       by the dataset's alias name. In addition, specific version identifiers can be appended to
       the URL with an additional @ symbol.
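
       As a rough illustration, the following Python sketch assembles such URLs. The helper
       names are hypothetical. The "#" separator between the store location and the dataset
       identifier is an assumption taken from DataLad's RIA URL documentation; it is not
       spelled out in the text above.

```python
from typing import Optional


def ria_store_url(scheme: str, location: str) -> str:
    """Build a RIA URL pointing to a store (hypothetical helper)."""
    if scheme not in ("ssh", "file", "http", "https"):
        raise ValueError(f"unsupported scheme: {scheme}")
    return f"ria+{scheme}://{location}"


def ria_dataset_url(
    store_url: str,
    dataset_id: Optional[str] = None,
    alias: Optional[str] = None,
    version: Optional[str] = None,
) -> str:
    """Build a RIA URL pointing to one dataset inside a store."""
    if (dataset_id is None) == (alias is None):
        raise ValueError("give exactly one of dataset_id or alias")
    # A dataset is addressed by its UUID, or by "~" followed by its alias.
    target = dataset_id if dataset_id is not None else f"~{alias}"
    # Assumption: "#" separates the store location from the dataset part.
    url = f"{store_url}#{target}"
    if version is not None:
        # A version identifier is appended after an "@" symbol.
        url += f"@{version}"
    return url


store = ria_store_url("file", "/absolute/path/to/ria-store")
print(ria_dataset_url(store, alias="demo"))
```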

       RIA store layout
       ~~~~~~~~~~~~~~~~

       A RIA store is a directory tree with a dedicated subdirectory for each dataset in the
       store. The subdirectory name is constructed from the DataLad dataset ID, e.g.
       '124/68afe-59ec-11ea-93d7-f0d5bf7b5561', where the first three characters of the ID are
       used for an intermediate subdirectory in order to mitigate file system limitations for
       stores containing a large number of datasets.
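
       The mapping from a dataset ID to its subdirectory can be sketched as follows. The
       function name is hypothetical; the split after the first three characters follows the
       description above.

```python
from pathlib import PurePosixPath


def dataset_subdir(dataset_id: str) -> PurePosixPath:
    """Return the store-relative directory for a dataset ID (hypothetical helper)."""
    if len(dataset_id) <= 3:
        raise ValueError("dataset ID is too short to be split")
    # The first three characters form the intermediate directory,
    # the remainder names the dataset directory inside it.
    return PurePosixPath(dataset_id[:3]) / dataset_id[3:]


print(dataset_subdir("12468afe-59ec-11ea-93d7-f0d5bf7b5561"))
```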

       By default, a dataset in a RIA store consists of two components: A Git repository (for all
       dataset  contents  stored  in  Git)  and  a storage sibling (for dataset content stored in
       git-annex).

       It is possible to selectively disable either component using ``storage-sibling 'only'``
       (no Git repository) or ``storage-sibling 'off'`` (no storage sibling). If neither
       component is disabled, a dataset's subdirectory layout in a RIA store contains a standard
       bare Git repository and an 'annex/' subdirectory inside of it. The latter holds a
       git-annex object store and comprises the storage sibling. Disabling the standard Git
       remote ('storage-sibling=only') results in not having the bare Git repository; disabling
       the storage sibling ('storage-sibling=off') results in not having the 'annex/'
       subdirectory.

       Optionally, there can be a further subdirectory 'archives' with (compressed) 7z archives
       of annex objects. The storage remote is able to pull annex objects from these archives if
       it cannot find them in the regular annex object store. This feature can be useful for
       storing large collections of rarely changing data on systems that limit the number of
       files that can be stored.

       Each dataset directory also contains a 'ria-layout-version' file that identifies the  data
       organization (as, for example, described above).

       Lastly,  there  is  a  global  'ria-layout-version'  file  at  the  store's base path that
       identifies where dataset subdirectories themselves are located. At present, this file must
       contain  a  single  line  stating  the  version (currently "1"). This line MUST end with a
       newline character.
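
       A minimal check of these constraints might look like the sketch below. The function
       name is hypothetical; it only encodes the two rules stated above (a single line,
       terminated by a newline) and does not interpret the version string itself.

```python
def read_layout_version(content: str) -> str:
    """Validate the content of a 'ria-layout-version' file and return its line."""
    if not content.endswith("\n"):
        raise ValueError("ria-layout-version must end with a newline character")
    lines = content.splitlines()
    if len(lines) != 1 or not lines[0]:
        raise ValueError("ria-layout-version must contain exactly one non-empty line")
    return lines[0]


print(read_layout_version("1\n"))
```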

       It is possible to define an alias for an individual dataset in a store by placing a
       symlink to the dataset location into an 'alias/' directory in the root of the store. This
       enables dataset access via URLs that reference the dataset by a ~ symbol followed by the
       alias name instead of by its ID.
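
       In normal use the --alias option creates this symlink. Purely to illustrate the
       layout, the following sketch creates one by hand on a local store. The function name is
       hypothetical, and the use of a relative link target is an assumption of this sketch,
       not a documented requirement.

```python
import os
from pathlib import Path


def add_alias(store_root: Path, dataset_id: str, alias: str) -> Path:
    """Create 'alias/<alias>' in a local store, pointing at the dataset directory."""
    dataset_dir = store_root / dataset_id[:3] / dataset_id[3:]
    if not dataset_dir.is_dir():
        raise FileNotFoundError(f"no such dataset directory: {dataset_dir}")
    alias_dir = store_root / "alias"
    alias_dir.mkdir(exist_ok=True)
    link = alias_dir / alias
    # Assumption: a relative target keeps the link valid if the store is moved.
    target = os.path.relpath(dataset_dir, alias_dir)
    os.symlink(target, link)
    return link
```

       Calling add_alias(root, "12468afe-59ec-11ea-93d7-f0d5bf7b5561", "demo") on a store
       that contains that dataset creates 'alias/demo' as a link to
       '../124/68afe-59ec-11ea-93d7-f0d5bf7b5561'.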

       Compared to standard git-annex object stores, the 'annex/' subdirectories used as storage
       siblings follow a different layout naming scheme ('dirhashmixed' instead of
       'dirhashlower'). This is mostly noted as a technical detail, but also serves to remind
       git-annex power users to refrain from running git-annex commands directly in-store, as
       doing so can cause severe damage due to the layout difference. Interactions should be
       handled via the ORA special remote instead.

       Error logging
       ~~~~~~~~~~~~~

       To enable error logging at the remote end, append a pipe symbol and an "l" to the version
       number in ria-layout-version (like so: '1|l\n').
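
       A sketch of how the file content could be rewritten to switch logging on is given
       below. The function name is hypothetical. The sketch keeps the trailing newline and
       treats everything after the pipe symbol as a set of single-letter flags, which is an
       assumption beyond what the text states.

```python
def enable_error_logging(content: str) -> str:
    """Return 'ria-layout-version' content with the "l" flag set."""
    version = content.rstrip("\n")
    if "|" in version:
        base, flags = version.split("|", 1)
    else:
        base, flags = version, ""
    if "l" not in flags:
        flags += "l"
    # The file content must end with a newline character.
    return f"{base}|{flags}\n"


print(repr(enable_error_logging("1\n")))
```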

       Error logging will create files in an "error_log" directory whenever the git-annex special
       remote (storage sibling) raises an exception, storing the Python traceback of it. The
       log files are named according to a scheme that shows which dataset the issue occurred
       with. Because logging can potentially leak personal data (local file paths, for example),
       it can be disabled client-side by setting the configuration variable
       "annex.ora-remote.<storage-sibling-name>.ignore-remote-config".

OPTIONS

       ria+<ssh|file|http(s)>://<host>[/path]
              URL identifying the target RIA store and access protocol. If ``--push-url`` is
              given in addition, this URL is used for read access only. Otherwise it is also
              used for write access and to create the repository sibling in the RIA store. Note
              that HTTP(S) is currently valid for consumption only, so ``--push-url`` must be
              provided in that case. Constraints: value must be a string or value must be NONE

       -h, --help, --help-np
              show  this  help  message.  --help-np  forcefully  disables  the use of a pager for
              displaying the help message

       -s NAME, --name NAME
              Name of the sibling. With RECURSIVE, the same name will be used to  label  all  the
              subdatasets' siblings. Constraints: value must be a string or value must be NONE

       -d DATASET, --dataset DATASET
              specify  the  dataset  to  process.  If  no dataset is given, an attempt is made to
              identify the dataset based on the current  working  directory.  Constraints:  Value
              must be a Dataset or a valid identifier of a Dataset (e.g. a path) or value must be
              NONE

       --storage-name NAME
              Name of the storage sibling (git-annex special remote). Must not  be  identical  to
              the  sibling  name.  If not specified, defaults to the sibling name plus '-storage'
              suffix. If only a storage sibling is created, this  setting  is  ignored,  and  the
              primary  sibling name is used. Constraints: value must be a string or value must be
              NONE

       --alias ALIAS
              Alias for the dataset in the RIA store. Adds the necessary symlink so that this
              dataset can be cloned from the RIA store using the given ALIAS instead of its ID.
              With --recursive, only the top dataset will be aliased. Constraints: value
              must be a string or value must be NONE

       --post-update-hook
              Enable  Git's default post-update-hook for the created sibling. This is useful when
              the sibling is made accessible via a  "dumb  server"  that  requires  running  'git
              update-server-info' to let Git interact properly with it.

       --shared {false|true|umask|group|all|world|everybody|0xxx}
              If given, configures the permissions in the RIA store for multi-user access.
              Possible values for this option are identical to those of `git init --shared` and
              are described in its documentation. Constraints: value must be a string or value
              must be convertible to type bool or value must be NONE

       --group GROUP
              Filesystem  group  for  the  repository.  Specifying  the  group  is  crucial  when
              --shared=group. Constraints: value must be a string or value must be NONE

       --storage-sibling MODE
              By default, an ORA storage sibling and a Git repository sibling are created (on).
              Alternatively, creation of the storage sibling can be disabled (off), or only a
              storage sibling can be created, without a Git sibling (only). In the latter mode,
              no Git installation is required on the target host. Constraints: value must be one
              of ('only',) or value must be convertible to type bool or value must be NONE
              [Default: True]

       --existing MODE
              Action to perform, if a (storage) sibling is already  configured  under  the  given
              name  and/or  a  target  already  exists.  In  this  case, a dataset can be skipped
              ('skip'), an existing target  repository  be  forcefully  re-initialized,  and  the
              sibling  (re-)configured  ('reconfigure'),  or  the  command  be instructed to fail
              ('error'). Constraints: value must  be  one  of  ('skip',  'error',  'reconfigure')
              [Default: 'error']

       --new-store-ok
              When set, a new store will be created if necessary. Otherwise, a sibling will only
              be created if the URL points to an existing RIA store.

       --trust-level TRUST-LEVEL
              specify a trust level for the storage sibling. If not specified, the  default  git-
              annex  trust  level  is  used. 'trust' should be used with care (see the git-annex-
              trust  man  page).  Constraints:  value  must  be  one  of  ('trust',  'semitrust',
              'untrust')

       -r, --recursive
              if set, recurse into potential subdatasets.

       -R LEVELS, --recursion-limit LEVELS
              limit  recursion into subdatasets to the given number of levels. Constraints: value
              must be convertible to type 'int' or value must be NONE

       --no-storage-sibling
              This option is deprecated. Use '--storage-sibling off' instead.

       --push-url ria+<ssh|file>://<host>[/path]
              URL identifying the target RIA store and access protocol for write access to the
              storage sibling. If given, this will also be used for creation of the repository
              sibling in the RIA store. Constraints: value must be a string or value must be NONE

       --version
              show the module and its version which provides the command
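
       To show how the options combine, the sketch below assembles an argument list for the
       command from Python. The helper name and its parameters are hypothetical, and only
       flags listed above are used. The sibling names and URLs in the usage lines are
       placeholders.

```python
from typing import List, Optional


def build_command(
    name: str,
    url: str,
    *,
    push_url: Optional[str] = None,
    alias: Optional[str] = None,
    existing: Optional[str] = None,
    new_store_ok: bool = False,
) -> List[str]:
    """Assemble a 'datalad create-sibling-ria' argument list (hypothetical helper)."""
    if url.startswith(("ria+http://", "ria+https://")) and push_url is None:
        # HTTP(S) is valid for consumption only, so a push URL is needed.
        raise ValueError("an HTTP(S) store URL requires --push-url")
    if existing is not None and existing not in ("skip", "error", "reconfigure"):
        raise ValueError(f"invalid --existing mode: {existing}")
    cmd = ["datalad", "create-sibling-ria", "-s", name]
    if alias is not None:
        cmd += ["--alias", alias]
    if existing is not None:
        cmd += ["--existing", existing]
    if new_store_ok:
        cmd.append("--new-store-ok")
    if push_url is not None:
        cmd += ["--push-url", push_url]
    cmd.append(url)
    return cmd


# Placeholder values; pass the list to subprocess.run() to execute it.
print(" ".join(build_command("ria-backup", "ria+file:///tmp/store", new_store_ok=True)))
```

       Returning a list rather than a single string keeps each argument intact, so names or
       paths containing spaces need no extra quoting when the list is handed to subprocess.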

AUTHORS

        datalad is developed by The DataLad Team and Contributors <team@datalad.org>.