Ubuntu Manpage: ocf_linbit_drbd - Manages a DRBD device as a Master/Slave resource

NAME

       ocf_linbit_drbd - Manages a DRBD device as a Master/Slave resource

SYNOPSIS

       drbd [start | stop | monitor | promote | demote | meta-data | validate-all]

DESCRIPTION

       This resource agent manages a DRBD resource as a master/slave resource. DRBD is a
       shared-nothing replicated storage device.

       NOTE: To avoid data-divergence, you should enable either DRBD "quorum" and "on-no-quorum
       io-error" (recommended), or configure proper fencing policies in both DRBD *and* Pacemaker
       (fencing resource-and-stonith). This cannot be done from this resource agent alone.

       See the DRBD User's Guide for more information. https://docs.linbit.com/

SUPPORTED PARAMETERS

drbd_resource
The name of the drbd resource from the drbd.conf file.

(unique, required, string, no default)

drbdconf
Full path to the drbd.conf file.

(optional, string, default "/etc/drbd.conf")

adjust_master_score
Space separated list of four master score adjustments for different scenarios:

- only access to 'consistent' data

- only remote access to 'uptodate' data

- currently Secondary, local access to 'uptodate' data, but remote is unknown

- local access to 'uptodate' data, and currently Primary or remote is known

Numeric values are expected to be non-decreasing.

The first value is 0 by default to prevent pacemaker from trying to promote while it
is unclear whether the data is really the most recent copy. (DRBD knows it is
"consistent", but is unsure about "uptodate"ness). Please configure proper fencing
methods both in DRBD (fencing resource-and-stonith; appropriate (un)fence-peer
handlers) AND in Pacemaker to make this work reliably.

Advanced use: Adjust the other values to better fit into complex dependency score
calculations.

Intentionally diskless nodes ("Diskless Clients") with access to good data via some
(or all) of their peers will use the 3rd or 4th value (minus one) when they are
(Secondary, not all peers up-to-date) or (ALL peers are up-to-date, or they are
Primary themselves). This may need to change if this should become a frequent use
case.

Special considerations:

If a Secondary DRBD is connected to a peer in Primary role, but Pacemaker does not
know about any Primary (using crm_resource --locate), we conclude that there likely is
a cluster-split-brain, and may try to "help" Pacemaker by removing the master-score.
Also see "remove_master_score_if_peer_primary".

(optional, string, default "0 10 1000 10000")
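The format constraints described above (four values, non-decreasing) can be checked before a value is put into the cluster configuration. The following is a minimal sketch only: check_master_scores is a hypothetical helper, not part of the resource agent, and it assumes that all four values are non-negative integers.

```shell
#!/bin/sh
# Hypothetical helper, not part of the resource agent:
# check that an adjust_master_score value consists of four
# non-decreasing, non-negative integers (the latter is an assumption).
check_master_scores() {
    # Split the single argument on whitespace into positional parameters.
    set -- $1
    [ "$#" -eq 4 ] || { echo "expected 4 values, got $#"; return 1; }
    prev=
    for score in "$@"; do
        case $score in
            ''|*[!0-9]*) echo "not a non-negative integer: $score"; return 1 ;;
        esac
        if [ -n "$prev" ] && [ "$score" -lt "$prev" ]; then
            echo "values decrease: $prev > $score"; return 1
        fi
        prev=$score
    done
    echo "ok"
}

check_master_scores "0 10 1000 10000"          # the documented default
check_master_scores "0 10 5 10000" || true     # rejected: 5 is lower than 10
```

The check only covers the format; whether a particular set of scores fits the dependency score calculations of a given cluster still has to be judged separately.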

stop_outdates_secondary
Recommended setting: leave at default (disabled).

Note that this feature depends on the information passed in
OCF_RESKEY_CRM_meta_notify_master_uname being correct, which unfortunately is not
reliable for pacemaker versions up to at least 1.0.10 / 1.1.4.

If a Secondary is stopped (unconfigured), it may be marked as outdated in the drbd
meta data, if we know there is still a Primary running in the cluster. Note that this
does not affect fencing policies set in drbd config, but is an additional safety
feature of this resource agent only. You can enable this behaviour by setting the
parameter to true.

If this feature seems to not do what you expect, make sure you have defined fencing
policies in the drbd configuration as well.

(optional, boolean, default false)

ignore_missing_notifications
Some setups do not benefit from notifications. This parameter allows you to disable
notifications without patching this resource agent.

(optional, boolean, default false)

wfc_timeout
Unless set to the empty string or any non-digits, wait (at most) this many seconds for
the connection(s) to be established after bringing them up during "start".

(optional, integer, default 5)
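The "empty string or any non-digits" rule above can be pictured as a simple pattern test. This is a sketch under that reading of the description: wfc_wait_enabled is a hypothetical name, and the agent's own check may be written differently.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a wfc_timeout value enables waiting.
wfc_wait_enabled() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit: do not wait
        *)           return 0 ;;  # digits only: wait at most this many seconds
    esac
}

for value in 5 "" off; do
    if wfc_wait_enabled "$value"; then
        echo "'$value': wait up to $value seconds"
    else
        echo "'$value': do not wait"
    fi
done
```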

remove_master_score_if_peer_primary
See also "adjust_master_score" and "fail_promote_early_if_peer_primary".

To prevent a potentially failed promotion attempt in case of cluster split-brain
(Pacemaker communication loss) while DRBD is still connected to a Primary, you can
request to remove any master score while DRBD is connected to a Primary (and that
Primary peer looks like it has all disks up-to-date).

This may delay legitimate failovers after Primary crash by up to some TCP timeout
(until DRBD realizes that the Primary is gone) plus one monitoring interval.

This parameter is interpreted almost as an "ocf boolean", with the exception of a
literal "unexpected", that is:

- "unexpected": assign master scores as described under "adjust_master_score", while
removing it if DRBD appears to see a (healthy) Primary that Pacemaker does not know
about (as determined by crm_resource --locate).

- everything else is "false": ignore the peer role while assigning master scores.

(optional, string, default "false")
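The interpretation rules above amount to a three-way classification of the configured value. The sketch below illustrates this; peer_primary_mode is a hypothetical helper, and the list of spellings treated as "true" is an assumption modeled on common OCF boolean handling, not taken from this resource agent.

```shell
#!/bin/sh
# Hypothetical helper, not the agent's own code: classify a value of
# remove_master_score_if_peer_primary into one of three modes.
peer_primary_mode() {
    case $1 in
        unexpected)
            # Remove the score only for a Primary Pacemaker does not know about.
            echo "unexpected" ;;
        1|yes|true|on|YES|TRUE|ON)
            # Assumed truthy spellings: remove the score while connected to a Primary.
            echo "true" ;;
        *)
            # Everything else: ignore the peer role.
            echo "false" ;;
    esac
}

peer_primary_mode unexpected
peer_primary_mode maybe
```

Note that any unrecognized spelling falls through to "false", so a typo in the value silently leaves the feature disabled.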

fail_promote_early_if_peer_primary
See also "adjust_master_score" and "remove_master_score_if_peer_primary".

To avoid a useless retry loop during promotion attempts in case of cluster split-brain
(Pacemaker communication loss) while DRBD is still connected to a Primary, you can
choose to give up after the first try if this situation is detected.

If a Primary "vanishes", TCP may not immediately detect this, and an idle DRBD may
take some time until it does in-DRBD-protocol "pings". Pacemaker may well detect
Primary loss earlier than DRBD, and try to promote while DRBD thinks it can still see
a Primary. This means that, in general, trying to promote at least once is necessary,
as that implies an in-DRBD-protocol "peer alive" check.

But if that does not succeed, re-trying until we hit the operation timeout may not be
desired, so you can disable it.

(optional, boolean, default false)

unfence_if_all_uptodate
If all volumes of this resource report to be UpToDate, call an unfence script hook,
just in case some stale fencing constraint or similar is still around.

- With DRBD utils version <= 8.9.4, this is hardcoded to
/usr/lib/drbd/crm-unfence-peer.sh -r $DRBD_RESOURCE

- With DRBD utils version >= 8.9.5, this is dispatched to $DRBDADM unfence-peer
$DRBD_RESOURCE

In any case, the hook itself is responsible for fetching $OCF_RESKEY_unfence_extra_args
from its environment.

(optional, boolean, default false)

unfence_extra_args
This may be used to pass extra hints to the unfence hook. See description of
unfence_if_all_uptodate.

(optional, string, default "--quiet --flock-required --flock-timeout 0
--unfence-only-if-owner-match")

require_drbd_module_version_ge
Use this if you want to force failure of this resource agent if the detected DRBD kernel
(module) driver version is lower than a required minimum.

Example: use require_drbd_module_version_ge=9.0.16 to fail unless DRBD module version
>= 9.0.16 is available (effectively requires DRBD 9).

The intention of this is to give a more useful failure message after accidentally
downgrading the DRBD version by installing/upgrading a new kernel.

Note: "ge", "greater-or-equal", inclusive. Required format: x.y.z

Set empty to skip this check.

(optional, string, default "8.0.0")

require_drbd_module_version_lt
Use this if you want to force failure of this resource agent if the detected DRBD kernel
(module) driver version is higher than a required maximum.

Example: use require_drbd_module_version_lt=9.0.0 to fail unless DRBD module version <
9.0 is available (effectively requires DRBD 8.4).

Note: "lt", "less-than", exclusive. Required format: x.y.z

Set empty to skip this check.

(optional, string, default "10.0.0")
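Taken together, require_drbd_module_version_ge and require_drbd_module_version_lt describe a half-open version range: the minimum is inclusive, the maximum is exclusive, and an empty bound skips that check. The following sketch illustrates this comparison for the x.y.z format. Both function names are hypothetical, the code is not the agent's own implementation, and it assumes each version component is a decimal number below 1000 without leading zeros.

```shell
#!/bin/sh
# Hypothetical helpers illustrating x.y.z range checks.
version_to_int() {
    # Convert "x.y.z" to one integer, e.g. 9.0.16 -> 9000016.
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( $1 * 1000000 + $2 * 1000 + $3 ))
}

version_in_range() {
    # $1 = detected version
    # $2 = required minimum (inclusive, "ge"); empty skips the check
    # $3 = required maximum (exclusive, "lt"); empty skips the check
    have=$(version_to_int "$1")
    if [ -n "$2" ] && [ "$have" -lt "$(version_to_int "$2")" ]; then return 1; fi
    if [ -n "$3" ] && [ "$have" -ge "$(version_to_int "$3")" ]; then return 1; fi
    return 0
}

if version_in_range "9.0.16" "8.0.0" "10.0.0"; then
    echo "version accepted"
else
    echo "version rejected"
fi
```

With the documented defaults ("8.0.0" and "10.0.0"), any 8.x or 9.x module version passes, while setting the maximum to 9.0.0 would reject 9.0.0 itself because the upper bound is exclusive.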

connect_only_after_promote
This may be useful for "stacked" setups without proper fencing on the lower layer
(which we obviously do not recommend), to avoid some of the ugly side effects that may
arise after resolving a split-brain on the lower layer.

Keep this DRBD instance disconnected until it is promoted. After promotion we issue an
additional "adjust", which is supposed to initiate the connection attempts.

This causes a new data generation identifier ("current uuid") to be generated after
the failover of a "healthy" DRBD.

(optional, boolean, default false)

SUPPORTED ACTIONS

       This resource agent supports the following actions (operations):

       start
           Starts the resource. Suggested minimum timeout: 240.

       reload
           Suggested minimum timeout: 30.

       promote
           Promotes the resource to the Master role. Suggested minimum timeout: 90.

       demote
           Demotes the resource to the Slave role. Suggested minimum timeout: 90.

       notify
           Suggested minimum timeout: 90.

       stop
           Stops the resource. Suggested minimum timeout: 100.

       monitor (Slave role)
           Performs a detailed status check. Suggested minimum timeout: 20. Suggested interval:
           20.

       monitor (Master role)
           Performs a detailed status check. Suggested minimum timeout: 20. Suggested interval:
           10.

       meta-data
           Retrieves resource agent metadata (internal use only). Suggested minimum timeout: 5.

       validate-all
           Performs a validation of the resource configuration.

EXAMPLE CRM SHELL

       The following is an example configuration for a drbd resource using the crm(8) shell:

           primitive p_drbd ocf:linbit:drbd \
             params \
               drbd_resource=string \
             op monitor timeout="20" interval="20" role="Slave" \
             op monitor timeout="20" interval="10" role="Master"

           ms ms_drbd p_drbd \
             meta notify="true" interleave="true"

EXAMPLE PCS

       The following is an example configuration for a drbd resource using pcs(8):

           pcs resource create p_drbd ocf:linbit:drbd \
             drbd_resource=string \
             op monitor timeout="20" interval="20" role="Slave" \
             op monitor timeout="20" interval="10" role="Master" --master

AUTHORS

       LINBIT HA Solutions GmbH
