Ubuntu Manpage: datalad create-sibling-gitlab - create dataset sibling at a GitLab site

NAME

       datalad create-sibling-gitlab - create dataset sibling at a GitLab site

SYNOPSIS


       datalad create-sibling-gitlab [-h] [--site SITENAME] [--project NAME/LOCATION]
              [--layout {collection|flat}] [--dataset DATASET] [-r] [-R LEVELS] [-s NAME]
              [--existing {skip|error|reconfigure}] [--access {http|ssh|ssh+http}]
              [--publish-depends SIBLINGNAME] [--description DESCRIPTION] [--dryrun] [--dry-run]
              [--version] [PATH ...]

DESCRIPTION

An existing GitLab project, or a project created via the GitLab web interface, can be configured as a
sibling with the siblings command. Alternatively, this command can create a GitLab project at any
location/path for which a given user has appropriate permissions. This is particularly helpful for
recursive sibling creation for subdatasets. API access and authentication are implemented via
python-gitlab, and all its features are supported. A particular GitLab site must be configured in a
named section of a python-gitlab.cfg file (see
https://python-gitlab.readthedocs.io/en/stable/cli-usage.html#configuration-file-format for details),
such as:

    [mygit]
    url = https://git.example.com
    api_version = 4
    private_token = abcdefghijklmnopqrst

Subsequently, this site is identified by its name ('mygit' in the example above).
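
As a quick sanity check before running the command, the named section can be inspected with Python's standard configparser module. This is only a minimal sketch: python-gitlab does its own parsing of the file, and the helper name read_site and the inline sample text are assumptions made for illustration.

```python
import configparser

# Sample content mirroring the section shown above; the token is a placeholder.
SAMPLE_CFG = """\
[mygit]
url = https://git.example.com
api_version = 4
private_token = abcdefghijklmnopqrst
"""


def read_site(cfg_text, site):
    """Return the settings of one named site section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    if not parser.has_section(site):
        raise KeyError(f"no section [{site}] in configuration")
    return dict(parser[site])


site = read_site(SAMPLE_CFG, "mygit")
print(site["url"])
```

A missing section raises KeyError here, which corresponds to a SITENAME that does not match any configured site.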

(Recursive) sibling creation for all subdatasets, or a selected subset of them, is supported with two
different project layouts (see --layout):

"flat"
All datasets are placed as GitLab projects in the same group. The project name of the top-level
dataset follows the configured datalad.gitlab-SITENAME-project configuration. The project names of
contained subdatasets extend the configured name with the subdataset's relative path within the root
dataset, with all path separator characters replaced by '-'. This replacement separator is
configurable (see Configuration).

"collection"
A new group is created for the dataset hierarchy, following the datalad.gitlab-SITENAME-project
configuration. The root dataset is placed in a "project" project inside this group, and all nested
subdatasets are represented inside the group using a "flat" layout. The root dataset's project name
is configurable (see Configuration).
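
The naming rules of the two layouts can be sketched in a few lines of Python. This is an illustration derived from the description above, not DataLad's actual implementation; the function name project_path and the example location "mygroup/mydata" are hypothetical, and joining the configured name and the subdataset path with the separator is an assumption based on the "superdataset-subdataset" pattern.

```python
def project_path(project, relpath="", layout="collection",
                 pathsep="-", root_name="project"):
    """Derive a GitLab project path for a dataset, per the layout description.

    project -- configured project location, e.g. "mygroup/mydata" (hypothetical)
    relpath -- POSIX path of a subdataset relative to the root dataset,
               or "" for the root dataset itself
    """
    if layout not in ("flat", "collection"):
        raise ValueError(f"unknown layout: {layout!r}")
    # Subdataset names use the relative path with "/" replaced by the separator.
    flat_name = relpath.strip("/").replace("/", pathsep)
    if layout == "flat":
        # The root keeps the configured name; subdatasets extend it.
        return project if not flat_name else f"{project}{pathsep}{flat_name}"
    # "collection": the configured location is a group holding all projects,
    # with the root dataset stored under root_name.
    return f"{project}/{flat_name or root_name}"


for layout in ("flat", "collection"):
    for rel in ("", "sub", "sub/nested"):
        print(layout, repr(rel), "->", project_path("mygroup/mydata", rel, layout))
```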

GitLab cannot host dataset content. However, in combination with other data sources (and siblings), pub‐
lishing a dataset to GitLab can facilitate distribution and exchange, while still allowing any dataset
consumer to obtain actual data content from alternative sources.

Configuration
Many configuration switches and options for GitLab sibling creation can be provided as arguments to the
command. However, it is also possible to specify a particular setup in a dataset's configuration. This is
particularly important when managing large collections of datasets. Configuration options are:

"datalad.gitlab-default-site"
Name of the default GitLab site (see --site)

"datalad.gitlab-SITENAME-siblingname"
Name of the sibling configured for the local dataset that points to the GitLab instance SITENAME
(see --name)

"datalad.gitlab-SITENAME-layout"
Project layout used at the GitLab instance SITENAME (see --layout)

"datalad.gitlab-SITENAME-access"
Access method used for the GitLab instance SITENAME (see --access)

"datalad.gitlab-SITENAME-project"
Project "location/path" used for a dataset at GitLab instance SITENAME (see --project). Configuring
this is useful for deriving project paths for subdatasets, relative to the superdataset. The
root-level group ("location") needs to be created beforehand via GitLab's web interface.

"datalad.gitlab-default-projectname"
The collection layout publishes (sub)datasets as projects with a custom name. The default name
"project" can be overridden with this configuration.

"datalad.gitlab-default-pathseparator"
The flat and collection layouts represent subdatasets with project names that correspond to their
path within the superdataset, with the regular path separator replaced with a "-":
superdataset-subdataset. This configuration can be used to override this default separator.
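
The SITENAME placeholder in these option names is replaced by the site's section name. The following Python sketch composes the per-site keys for a site called "mygit" and prints them as git config invocations. It assumes the dataset configuration is stored through git config; the helper site_config and all values shown are hypothetical.

```python
def site_config(site, *, siblingname, layout, access, project):
    """Build dataset configuration items for one GitLab site."""
    if layout not in ("collection", "flat"):
        raise ValueError(f"unsupported layout: {layout!r}")
    if access not in ("http", "ssh", "ssh+http"):
        raise ValueError(f"unsupported access method: {access!r}")
    prefix = f"datalad.gitlab-{site}-"
    return {
        "datalad.gitlab-default-site": site,
        prefix + "siblingname": siblingname,
        prefix + "layout": layout,
        prefix + "access": access,
        prefix + "project": project,
    }


settings = site_config(
    "mygit",
    siblingname="gitlab",
    layout="collection",
    access="ssh",
    project="mygroup/mydata",
)
for key, value in settings.items():
    # Assumption: each pair is stored in the dataset's local git configuration.
    print(f"git config --local {key} {value}")
```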

This command can be configured with "datalad.create-sibling-ghlike.extra-remote-settings.NETLOC.KEY=VALUE"
in order to add any local KEY = VALUE configuration to the created sibling in the local .git/config
file. NETLOC is the domain of the GitLab instance to apply the configuration for. This leads to a
behavior that is equivalent to calling datalad's siblings configure command with the respective
KEY-VALUE pair after creating the sibling. The configuration, like any other, could be set at user
or system level, so users do not need to add this configuration to every sibling created with the
service at NETLOC themselves.
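
How such a configuration name is put together can be sketched as follows. The sketch assumes that NETLOC corresponds to the network location part of the site URL as returned by urllib.parse; the helper extra_setting_key and the key "annex-ignore" are illustrative choices, not taken from this manual.

```python
from urllib.parse import urlparse

PREFIX = "datalad.create-sibling-ghlike.extra-remote-settings"


def extra_setting_key(site_url, key):
    """Compose the configuration name for an extra remote setting."""
    netloc = urlparse(site_url).netloc
    if not netloc:
        raise ValueError(f"cannot determine NETLOC from {site_url!r}")
    return f"{PREFIX}.{netloc}.{key}"


# "annex-ignore" is only an example KEY chosen for this sketch.
print(extra_setting_key("https://git.example.com", "annex-ignore"))
```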

OPTIONS

PATH
selectively create siblings for any datasets underneath a given path. By default only the root
dataset is considered.

-h, --help, --help-np
show this help message. --help-np forcefully disables the use of a pager for displaying the help
message

--site SITENAME
name of the GitLab site to create a sibling at. Must match an existing python-gitlab configuration
section with location and authentication settings (see
https://python-gitlab.readthedocs.io/en/stable/cli-usage.html#configuration). By default the
dataset configuration is consulted.
Constraints: value must be NONE or value must be a string

--project NAME/LOCATION
project name/location at the GitLab site. If a subdataset of the reference dataset is processed,
its project path is automatically determined by the LAYOUT configuration, by default. Users need
to create the root-level GitLab group (NAME) via the web interface before running the command.
Constraints: value must be NONE or value must be a string

--layout {collection|flat}
layout of projects at the GitLab site, if a collection, or a hierarchy of datasets and subdatasets
is to be created. By default the dataset configuration is consulted. Constraints: value must be
one of ('collection', 'flat')

--dataset DATASET, -d DATASET
reference or root dataset. If no path constraints are given, a sibling for this dataset will be
created. In this and all other cases, the reference dataset is also consulted for the GitLab
configuration, and desired project layout. If no dataset is given, an attempt is made to identify the
dataset based on the current working directory. Constraints: Value must be a Dataset or a valid
identifier of a Dataset (e.g. a path) or value must be NONE

-r, --recursive
if set, recurse into potential subdatasets.

-R LEVELS, --recursion-limit LEVELS
limit recursion into subdatasets to the given number of levels. Constraints: value must be
convertible to type 'int' or value must be NONE

-s NAME, --name NAME
name to represent the GitLab sibling remote in the local dataset installation. If not specified a
name is looked up in the dataset configuration, or defaults to the SITE name. Constraints: value
must be a string or value must be NONE

--existing {skip|error|reconfigure}
desired behavior when already existing or configured siblings are discovered. 'skip': ignore;
'error': fail, if access URLs differ; 'reconfigure': use the existing repository and reconfigure the
local dataset to use it as a sibling. Constraints: value must be one of ('skip', 'error',
'reconfigure') [Default: 'error']

--access {http|ssh|ssh+http}
access method used for data transfer to and from the sibling. 'ssh': read and write access use
the SSH protocol; 'http': read and write access use HTTP requests; 'ssh+http': read access is done
via HTTP and write access is performed with SSH. Dataset configuration is consulted for a default,
'http' is used otherwise. Constraints: value must be one of ('http', 'ssh', 'ssh+http')

--publish-depends SIBLINGNAME
add a dependency such that the given existing sibling is always published prior to the new
sibling. This is equivalent to setting a configuration item 'remote.SIBLINGNAME.datalad-publish-depends'. This
option can be given more than once to configure multiple dependencies. Constraints: value must be
a string or value must be NONE

--description DESCRIPTION
brief description for the GitLab project (displayed on the site). Constraints: value must be a
string or value must be NONE

--dryrun
Deprecated. Use the renamed --dry-run parameter.

--dry-run
if set, no repository will be created, only tests for name collisions will be performed, and
would-be repository names are reported for all relevant datasets.

--version
show the module and its version which provides the command
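
The options above can be combined into a single invocation. The Python sketch below only assembles and prints an argument list from flags documented in this section; it does not call datalad. The helper build_command and the values "mygit" and "mygroup/mydata" are assumptions made for illustration.

```python
import shlex


def build_command(project, *, site=None, layout=None, access=None,
                  recursive=False, dry_run=False, paths=()):
    """Assemble an argument list for `datalad create-sibling-gitlab`."""
    cmd = ["datalad", "create-sibling-gitlab", "--project", project]
    if site:
        cmd += ["--site", site]
    if layout:
        if layout not in ("collection", "flat"):
            raise ValueError(f"unsupported layout: {layout!r}")
        cmd += ["--layout", layout]
    if access:
        if access not in ("http", "ssh", "ssh+http"):
            raise ValueError(f"unsupported access method: {access!r}")
        cmd += ["--access", access]
    if recursive:
        cmd.append("--recursive")
    if dry_run:
        # Only report would-be project names; nothing is created.
        cmd.append("--dry-run")
    cmd += list(paths)
    return cmd


cmd = build_command("mygroup/mydata", site="mygit", layout="collection",
                    recursive=True, dry_run=True)
print(shlex.join(cmd))
# A real run could pass the list to subprocess.run(cmd, check=True).
```

Starting with --dry-run is a cautious choice: it reports the would-be project names and name collisions before anything is created at the GitLab site.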

AUTHORS

        datalad is developed by The DataLad Team and Contributors <team@datalad.org>.

datalad create-sibling-gitlab 1.1.5                2025-03-03                   datalad create-sibling-gitlab(1)