Provided by: libhts3t64_1.20+ds-1_amd64

NAME

       htslib-s3-plugin - htslib AWS S3 plugin

DESCRIPTION

       The S3 plugin allows htslib file functions to communicate with servers that use the AWS S3
       protocol.  Files are identified by their bucket and object key in URL form, for example:

       s3://mybucket/path/to/file

       Here mybucket is the bucket and path/to/file is the object key.

       Necessary security information can be provided as part of the URL, in environment
       variables or in configuration files.

       The full URL format is:

       s3[+SCHEME]://[ID[:SECRET[:TOKEN]]@]BUCKET/PATH

       The elements are:

       SCHEME The protocol used.  Defaults to https.

       ID     The user AWS access key.

       SECRET The secret key for use with the access key.

       TOKEN  Token used for temporary security credentials.

       BUCKET AWS S3 bucket.

       PATH   Path to the object under the bucket.
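
       As an illustration of the format above, the following shell sketch assembles two URLs
       from their elements.  The bucket name, object key and variable names are placeholders
       chosen for this example; they are not values defined by the plugin.

```shell
#!/bin/sh
# Placeholder bucket and object key, for illustration only.
bucket='mybucket'
key='path/to/file'

# Shortest form: default scheme (https), credentials supplied elsewhere.
url_short="s3://${bucket}/${key}"

# Explicit scheme, following s3[+SCHEME]://BUCKET/PATH.
url_http="s3+http://${bucket}/${key}"

echo "$url_short"
echo "$url_http"
```

       The ID, SECRET and TOKEN elements are left out here because anything placed in the URL
       is visible in shell history and process listings.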

       The environment variables below will be used if the user ID is not set in the URL.

       AWS_ACCESS_KEY_ID
              The user AWS access key.

       AWS_SECRET_ACCESS_KEY
              The secret key for use with the access key.

       AWS_DEFAULT_REGION
              The region to use. Defaults to us-east-1.

       AWS_SESSION_TOKEN
              Token used for temporary security credentials.

       AWS_DEFAULT_PROFILE
              The profile to use in credentials, config or s3cfg files.  Defaults to default.

       AWS_PROFILE
              Same as AWS_DEFAULT_PROFILE.

       AWS_SHARED_CREDENTIALS_FILE
              Location of the credentials file.  Defaults to ~/.aws/credentials.

       HTS_S3_S3CFG
              Location of the s3cfg file.  Defaults to ~/.s3cfg.

       HTS_S3_HOST
              Sets the host.  Defaults to s3.amazonaws.com.

       HTS_S3_V2
              If set, use signature v2 rather than the default v4.  This will limit the plugin to
              reading only.

       HTS_S3_PART_SIZE
              Sets the upload part size in Mb, the minimum being 5Mb.  By default the  part  size
              starts  at  5Mb and expands at regular intervals to accommodate bigger files (up to
              2.5 Tbytes with the current rate).  Using this setting disables the automatic  part
              size expansion.

       HTS_S3_ADDRESS_STYLE
              Sets the URL style.  Options are auto (default), virtual or path.
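
       A minimal sketch of configuring the plugin through the environment follows.  The key
       values and region are placeholders.  The arithmetic at the end rests on an assumption
       that is not stated in this page: that the service accepts at most 10000 parts per
       upload, as AWS S3 multipart uploads usually do.

```shell
#!/bin/sh
# Placeholder credentials; substitute real values.
export AWS_ACCESS_KEY_ID='EXAMPLE_ACCESS_KEY_ID'
export AWS_SECRET_ACCESS_KEY='EXAMPLE_SECRET_KEY'
export AWS_DEFAULT_REGION='eu-west-1'

# Fix the upload part size at 64 Mb; this disables automatic expansion.
export HTS_S3_PART_SIZE=64

# Assumption: at most 10000 parts per upload (not stated in this page).
max_parts=10000
max_upload_mb=$((HTS_S3_PART_SIZE * max_parts))
echo "largest upload with fixed ${HTS_S3_PART_SIZE} Mb parts: ${max_upload_mb} Mb"
```

       A command such as htsfile s3://mybucket/path/to/file run from the same shell would then
       be expected to pick these settings up.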

       In the absence of an ID from the previous two methods, the credential/config files will be
       used.  The default file locations are ~/.aws/credentials and ~/.s3cfg, checked in that
       order.

       Entries used in AWS-style credentials files are aws_access_key_id, aws_secret_access_key,
       aws_session_token, region, addressing_style and expiry_time (unofficial, see SHORT-LIVED
       CREDENTIALS below).  Only the first two are usually needed.
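
       The sketch below writes a small credentials file of this kind to a temporary directory
       and points the plugin at it.  All values are placeholders, and the [default] section
       header is an assumption based on the usual AWS file layout and the default profile
       name.

```shell
#!/bin/sh
# Create a sample credentials file in a temporary directory (placeholder values).
demo_dir=$(mktemp -d)
cat > "$demo_dir/credentials" <<'EOF'
[default]
aws_access_key_id = EXAMPLE_ACCESS_KEY_ID
aws_secret_access_key = EXAMPLE_SECRET_KEY
region = eu-west-1
EOF

# Use this file instead of ~/.aws/credentials.
export AWS_SHARED_CREDENTIALS_FILE="$demo_dir/credentials"
cat "$AWS_SHARED_CREDENTIALS_FILE"
```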

       Entries used in s3cmd-style config files are access_key, secret_key, access_token,
       host_base, bucket_location and host_bucket.  Again, only the first two are usually needed.
       The host_bucket option is only used to set a path-style URL; see NOTES below.
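
       For comparison, a sketch of an s3cmd-style file follows.  The host name s3.example.org
       and the key values are placeholders, and the [default] header is assumed from the usual
       s3cmd layout.  Here host_bucket is written without the %(bucket).s string, which is the
       form used to request a path-style URL.

```shell
#!/bin/sh
# Create a sample s3cfg file in a temporary directory (placeholder values).
demo_dir=$(mktemp -d)
cat > "$demo_dir/s3cfg" <<'EOF'
[default]
access_key = EXAMPLE_ACCESS_KEY_ID
secret_key = EXAMPLE_SECRET_KEY
host_base = s3.example.org
host_bucket = s3.example.org
EOF

# Use this file instead of ~/.s3cfg.
export HTS_S3_S3CFG="$demo_dir/s3cfg"
cat "$HTS_S3_S3CFG"
```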

SHORT-LIVED CREDENTIALS

       Some  cloud  identity and access management (IAM) systems can make short-lived credentials
       that allow access to resources.  These credentials will expire after a time and need to be
       renewed  to  give  continued  access.  To enable this, the S3 plugin allows an expiry_time
       entry to be set in the .aws/credentials file.  The value for this entry should be the time
       when the token expires, following the format in RFC3339 section 5.6, which takes the form:

          2012-04-29T05:20:48Z

       That  is,  year  -  month  - day, the letter "T", hour : minute : second.  The time can be
       followed by the letter "Z", indicating the UTC timezone, or an offset from UTC which is  a
       "+"  or  "-" sign followed by two digits for the hours offset, ":", and two digits for the
       minutes.
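
       The following sketch checks a string against the form just described and produces a
       current UTC timestamp in that form.  The function name is_expiry_time is hypothetical.
       The check covers only the layout described here, so it does not accept the fractional
       seconds that RFC 3339 also permits, and it does not reject out-of-range values such as
       month 13.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is date, "T", time, then "Z" or a
# +hh:mm / -hh:mm offset.  Layout check only; field ranges are not verified.
is_expiry_time() {
    printf '%s\n' "$1" | grep -Eq \
        '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}(Z|[+-][0-9]{2}:[0-9]{2})$'
}

# Current time in UTC, in a form usable as an expiry_time value.
now_utc=$(date -u '+%Y-%m-%dT%H:%M:%SZ')

is_expiry_time "$now_utc" && echo "accepted: $now_utc"
is_expiry_time '2012-04-29 05:20:48' || echo "rejected: no T separator or zone"
```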

       The S3 plugin will attempt to re-read the credentials file up to 1 minute before the given
       expiry  time,  which  means the file needs to be updated with new credentials before then.
       As the exact way of doing this can vary between services and IAM providers, the S3  plugin
       expects  this  to  be  done by an external user-supplied process.  This may be achieved by
       running a program that replaces  the  file  as  new  credentials  become  available.   The
       following script shows how it might be done for AWS instance credentials:

         #!/bin/sh
         instance='http://169.254.169.254'
         tok_url="$instance/latest/api/token"
         ttl_hdr='X-aws-ec2-metadata-token-ttl-seconds: 10'
         creds_url="$instance/latest/meta-data/iam/security-credentials"
         key1='aws_access_key_id = \(.AccessKeyId)\n'
         key2='aws_secret_access_key = \(.SecretAccessKey)\n'
         key3='aws_session_token = \(.Token)\n'
         key4='expiry_time = \(.Expiration)\n'
         while true; do
             token=`curl -X PUT -H "$ttl_hdr" "$tok_url"`
             tok_hdr="X-aws-ec2-metadata-token: $token"
             role=`curl -H "$tok_hdr" "$creds_url/"`
             expires='now'
             ( curl -H "$tok_hdr" "$creds_url/$role" \
               | jq -r "\"${key1}${key2}${key3}${key4}\"" > credentials.new ) \
               && mv -f credentials.new credentials \
               && expires=`grep expiry_time credentials | cut -d ' ' -f 3-`
             if test $? -ne 0 ; then break ; fi
             expiry=`date -d "$expires - 3 minutes" '+%s'`
             now=`date '+%s'`
             test "$expiry" -gt "$now" && sleep $((($expiry - $now) / 2))
             sleep 30
         done

       Note  that  the  expiry_time key is currently only supported for the .aws/credentials file
       (or the file referred to in the AWS_SHARED_CREDENTIALS_FILE environment variable).

NOTES

       In most cases this plugin transforms the given URL into a virtual host-style format, e.g.
       https://bucket.host/path/to/file.  A path-style format is used where the URL is not DNS
       compliant or the bucket name contains a dot, e.g. https://host/bu.cket/path/to/file.

       Path-style can be forced by setting one of HTS_S3_ADDRESS_STYLE, addressing_style or
       host_bucket.  The first two can be set to path, while host_bucket must not include the
       %(bucket).s string.
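
       The two URL layouts can be illustrated with a small shell function.  This is a sketch
       of the rule as described in this section, not the plugin's own code: s3_request_url is
       a hypothetical helper, and in auto mode it models only the "bucket name contains a dot"
       condition, not the DNS compliance check.

```shell
#!/bin/sh
# s3_request_url STYLE HOST BUCKET KEY   (hypothetical helper)
# STYLE is auto, virtual or path, as for HTS_S3_ADDRESS_STYLE.
s3_request_url() {
    style=$1
    host=$2
    bucket=$3
    key=$4
    if [ "$style" = auto ]; then
        # Simplification: only the dot rule is modelled here.
        case $bucket in
            *.*) style=path ;;
            *)   style=virtual ;;
        esac
    fi
    if [ "$style" = path ]; then
        echo "https://$host/$bucket/$key"
    else
        echo "https://$bucket.$host/$key"
    fi
}

s3_request_url auto s3.amazonaws.com mybucket path/to/file
s3_request_url auto s3.amazonaws.com my.bucket path/to/file
s3_request_url path s3.amazonaws.com mybucket path/to/file
```

       The first call prints a virtual host-style URL; the other two print path-style URLs,
       one because of the dot in the bucket name and one because the style is forced.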

SEE ALSO

       htsfile(1), samtools(1)

       RFC 3339: <https://www.rfc-editor.org/rfc/rfc3339#section-5.6>

       htslib website: <http://www.htslib.org/>