lunar (1) dcm2xml.1.gz

Provided by: dcmtk_3.6.7-8build1_amd64 bug

NAME

       dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS

       dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION

       The dcm2xml utility converts the contents of a DICOM file (file format or raw data set) to
       XML (Extensible Markup Language). There are two output formats. The first one is  specific
       to  DCMTK  with  its DTD (Document Type Definition) described in the file dcm2xml.dtd. The
       second one refers to the 'Native DICOM Model' which is specified for the DICOM Application
       Hosting service found in DICOM part 19.

       If  dcm2xml  reads  a  raw data set (DICOM data without a file format meta-header) it will
       attempt to guess the transfer syntax by examining the first few bytes of the file.  It  is
       not  always  possible to correctly guess the transfer syntax and it is better to convert a
       data set to a file format whenever possible  (using  the  dcmconv  utility).  It  is  also
       possible  to  use  the  -f  and -t[ieb] options to force dcm2xml to read a data set with a
       particular transfer syntax.

PARAMETERS

       dcmfile-in   DICOM input filename to be converted

       xmlfile-out  XML output filename (default: stdout)

OPTIONS

   general options
         -h    --help
                 print this help text and exit

               --version
                 print version information and exit

               --arguments
                 print expanded command line arguments

         -q    --quiet
                 quiet mode, print no warnings and errors

         -v    --verbose
                 verbose mode, print processing details

         -d    --debug
                 debug mode, print debug information

         -ll   --log-level  [l]evel: string constant
                 (fatal, error, warn, info, debug, trace)
                 use level l for the logger

         -lc   --log-config  [f]ilename: string
                 use config file f for the logger

   input options
       input file format:

         +f    --read-file
                 read file format or data set (default)

         +fo   --read-file-only
                 read file format only

         -f    --read-dataset
                 read data set without file meta information

       input transfer syntax:

         -t=   --read-xfer-auto
                 use TS recognition (default)

         -td   --read-xfer-detect
                 ignore TS specified in the file meta header

         -te   --read-xfer-little
                 read with explicit VR little endian TS

         -tb   --read-xfer-big
                 read with explicit VR big endian TS

         -ti   --read-xfer-implicit
                 read with implicit VR little endian TS

       long tag values:

         +M    --load-all
                 load very long tag values (e.g. pixel data)

         -M    --load-short
                 do not load very long values (default)

         +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
                 set threshold for long values to k kbytes

   processing options
       specific character set:

         +Cr   --charset-require
                 require declaration of extended charset (default)

         +Ca   --charset-assume  [c]harset: string
                 assume charset c if no extended charset declared

         +Cc   --charset-check-all
                 check all data elements with string values
                 (default: only PN, LO, LT, SH, ST, UC and UT)

                 # this option is only used for the mapping to an appropriate
                 # XML character encoding, but not for the conversion to UTF-8

         +U8   --convert-to-utf8
                 convert all element values that are affected
                 by Specific Character Set (0008,0005) to UTF-8

                 # requires support from an underlying character encoding library
                 # (see output of --version on which one is available)

   output options
       general XML format:

         -dtk  --dcmtk-format
                 output in DCMTK-specific format (default)

         -nat  --native-format
                 output in Native DICOM Model format (part 19)

         +Xn   --use-xml-namespace
                 add XML namespace declaration to root element

       DCMTK-specific format (not with --native-format):

         +Xd   --add-dtd-reference
                 add reference to document type definition (DTD)

         +Xe   --embed-dtd-content
                 embed document type definition into XML document

         +Xf   --use-dtd-file  [f]ilename: string
                 use specified DTD file (only with +Xe)
                 (default: /usr/local/share/dcmtk/dcm2xml.dtd)

         +Wn   --write-element-name
                 write name of the DICOM data elements (default)

         -Wn   --no-element-name
                 do not write name of the DICOM data elements

         +Wb   --write-binary-data
                 write binary data of OB and OW elements
                 (default: off, be careful with --load-all)

       encoding of binary data:

         +Eh   --encode-hex
                 encode binary data as hex numbers
                 (default for DCMTK-specific format)

         +Eu   --encode-uuid
                 encode binary data as a UUID reference
                 (default for Native DICOM Model)

         +Eb   --encode-base64
                 encode binary data as Base64 (RFC 2045, MIME)

DCMTK Format

       The basic structure of the DCMTK-specific XML output created from a DICOM file looks  like
       the following:

       <?xml version='1.0' encoding='ISO-8859-1'?>
       <!DOCTYPE file-format SYSTEM 'dcm2xml.dtd'>
       <file-format xmlns='http://dicom.offis.de/dcmtk'>
         <meta-header xfer='1.2.840.10008.1.2.1' name='Little Endian Explicit'>
           <element tag='0002,0000' vr='UL' vm='1' len='4'
                    name='MetaElementGroupLength'>
             166
           </element>
           ...
           <element tag='0002,0013' vr='SH' vm='1' len='16'
                    name='ImplementationVersionName'>
             OFFIS_DCMTK_353
           </element>
         </meta-header>
         <data-set xfer='1.2.840.10008.1.2' name='Little Endian Implicit'>
           <element tag='0008,0005' vr='CS' vm='1' len='10'
                    name='SpecificCharacterSet'>
             ISO_IR 100
           </element>
           ...
           <sequence tag='0028,3010' vr='SQ' card='2' name='VOILUTSequence'>
             <item card='3'>
               <element tag='0028,3002' vr='xs' vm='3' len='6'
                        name='LUTDescriptor'>
                 256\0\8
               </element>
               ...
             </item>
             ...
           </sequence>
           ...
           <element tag='7fe0,0010' vr='OW' vm='1' len='262144'
                    name='PixelData' loaded='no' binary='hidden'>
           </element>
         </data-set>
       </file-format>

       The 'file-format' and 'meta-header' tags are absent for DICOM data sets.

   XML Encoding
       Attributes  with very large value fields (e.g. pixel data) are not loaded by default. They
       can be identified by the additional attribute 'loaded' with a value of 'no'  (see  example
       above).  The  command line option --load-all forces to load all value fields including the
       very long ones.

       Furthermore, binary data of OB and OW attributes are not written to the XML output file by
       default.  These  elements  can  be  identified by the additional attribute 'binary' with a
       value of 'hidden' (default is 'no'). The command line  option  --write-binary-data  causes
       also  binary  value  fields  to be printed (attribute value is 'yes' or 'base64'). But, be
       careful when using this option together with --load-all because of the  large  amounts  of
       pixel  data  that might be printed to the output. Please note that in this context element
       values with a VR of OD, OF, OL and OV are not regarded as 'binary data'.

       Multiple values (i.e. where the DICOM value multiplicity is greater than 1) are  separated
       by  a  backslash  '\' (except for Base64 encoded data).  The 'len' attribute indicates the
       number of bytes for the particular value field as stored in the DICOM data  set,  i.e.  it
       might  deviate  from  the XML encoded value length e.g. because of non-significant padding
       that has been removed.  If this attribute is missing in 'sequence' or 'item'  start  tags,
       the corresponding DICOM element has been stored with undefined length.

       @section dcm2xml_native_format Native DICOM Model Format

       The  description of the Native DICOM Model format can be found in the DICOM standard, part
       19 ('Application Hosting').

       @subsection dcm2xml_bulk_data Bulk Data

       Binary data, i.e. DICOM element values with Value Representations (VR) of  OB  or  OW,  as
       well  as OD, OF, OL, OV and UN values are by default not written to the XML output because
       of their size.  Instead, for each element, a new Universally Unique Identifier  (UUID)  is
       being  generated and written as an attribute of a \<BulkData\> XML element.  So far, there
       is no possibility to write an additional file to hold the binary  data  for  each  of  the
       binary data chunks.  This is not required by the standard, however, it might be useful for
       implementing an Application Hosting interface; thus  this  feature  may  be  available  in
       future versions of \b dcm2xml.

       In  addition,  Supplement  163  (Store  Over  the  Web  by Representational State Transfer
       Services) introduces a new \<InlineBinary\> XML element that allows  for  encoding  binary
       data  as  Base64.   Currently,  the  command  line  option \e --encode-base64 enables this
       encoding for the following VRs: OB, OD, OF, OL, OV, OW and UN.

       @subsection dcm2xml_known_issues Known Issues

       In addition to what is written in the above section on  'Bulk  Data',  there  are  further
       known  issues  with  the  current  implementation  of  the Native DICOM Model format.  For
       example, large element values with a VR other than OB, OD,  OF,  OL,  OV,  OW  or  UN  are
       currently never written as bulk data, although it might be useful, e.g. for very long text
       elements (especially UT) or very long numeric fields (of various VRs).

       @section dcm2xml_notes NOTES

       @subsection dcm2xml_character_encoding Character Encoding

       The XML  encoding  is  determined  automatically  from  the  DICOM  attribute  (0008,0005)
       'Specific Character Set' using the following mapping:

       @verbatim  ASCII         (ISO_IR 6)    =>  'UTF-8' UTF-8         'ISO_IR 192'  =>  'UTF-8'
       ISO Latin 1   'ISO_IR 100'  =>  'ISO-8859-1' ISO Latin 2   'ISO_IR 101'  =>   'ISO-8859-2'
       ISO  Latin 3   'ISO_IR 109'  =>  'ISO-8859-3' ISO Latin 4   'ISO_IR 110'  =>  'ISO-8859-4'
       ISO Latin 5   'ISO_IR 148'  =>  'ISO-8859-9' Cyrillic      'ISO_IR 144'  =>   'ISO-8859-5'
       Arabic         'ISO_IR 127'  =>  'ISO-8859-6' Greek         'ISO_IR 126'  =>  'ISO-8859-7'
       Hebrew        'ISO_IR 138'  =>  'ISO-8859-8' \endverbatim

       If this DICOM attribute  is  missing  in  the  input  file,  although  needed,  option  \e
       --charset-assume  can  be used to specify an appropriate character set manually (using one
       of the DICOM defined terms).  For reasons of backward compatibility with previous versions
       of  this  tool,  the  following  terms  are also supported and mapped automatically to the
       associated DICOM defined terms: latin-1, latin-2,  latin-3,  latin-4,  latin-5,  cyrillic,
       arabic, greek, hebrew.

       Multiple  character  sets  using  code extension techniques are not supported.  If needed,
       option \e --convert-to-utf8 can be used to convert the DICOM file or  data  set  to  UTF-8
       encoding  prior  to  the conversion to XML format.  This is also useful for DICOMDIR files
       where each directory record can have a different character set.

       If no mapping is defined and option \e --convert-to-utf8 is not used, non-ASCII characters
       and  those  below  #32  are stored as '&#nnn;' where 'nnn' refers to the numeric character
       code.  This might lead to invalid character entity references (such as  '&#27;'  for  ESC)
       and will cause most XML parsers to reject the document.

       @section dcm2xml_logging LOGGING

       The level of logging output of the various command line tools and underlying libraries can
       be specified by the user.  By default,  only  errors  and  warnings  are  written  to  the
       standard  error  stream.   Using  option  \e  --verbose  also  informational messages like
       processing details are reported.  Option \e --debug can be used to get more details on the
       internal  activity,  e.g.  for  debugging  purposes.  Other logging levels can be selected
       using option \e --log-level.  In \e --quiet mode only fatal errors are reported.  In  such
       very severe error events, the application will usually terminate.  For more details on the
       different logging levels, see documentation of module 'oflog'.

       In case the logging output should be written to file (optionally with  logfile  rotation),
       to  syslog  (Unix)  or  the  event log (Windows) option \e --log-config can be used.  This
       configuration file also allows for directing only certain messages to a particular  output
       stream  and  for  filtering certain messages based on the module or application where they
       are    generated.     An     example     configuration     file     is     provided     in
       <em>\<etcdir\>/logger.cfg</em>.

       @section dcm2xml_command_line COMMAND LINE

       All  command line tools use the following notation for parameters: square brackets enclose
       optional values (0-1), three trailing dots  indicate  that  multiple  values  are  allowed
       (1-n), a combination of both means 0 to n values.

       Command  line  options  are  distinguished  from  parameters by a leading '+' or '-' sign,
       respectively.  Usually, order and position of command line  options  are  arbitrary  (i.e.
       they  can  appear  anywhere).   However,  if  options are mutually exclusive the rightmost
       appearance is used.  This behavior conforms to the standard  evaluation  rules  of  common
       Unix shells.

       In  addition,  one or more command files can be specified using an '@' sign as a prefix to
       the filename (e.g. <em>\@command.txt</em>).  Such a command argument is  replaced  by  the
       content  of  the  corresponding  text  file  (multiple whitespaces are treated as a single
       separator unless they appear between two quotation marks) prior to any further evaluation.
       Please  note  that  a  command  file cannot contain another command file.  This simple but
       effective approach allows one to summarize common combinations of  options/parameters  and
       avoids   longish   and   confusing   command   lines  (an  example  is  provided  in  file
       <em>\<datadir\>/dumppat.txt</em>).

       @section dcm2xml_environment ENVIRONMENT

       The \b dcm2xml utility will attempt to load DICOM data dictionaries specified  in  the  \e
       DCMDICTPATH  environment  variable.   By  default,  i.e. if the \e DCMDICTPATH environment
       variable is not set, the file <em>\<datadir\>/dicom.dic</em> will  be  loaded  unless  the
       dictionary is built into the application (default for Windows).

       The  default behavior should be preferred and the \e DCMDICTPATH environment variable only
       used when alternative data dictionaries are  required.   The  \e  DCMDICTPATH  environment
       variable  has  the  same  format  as the Unix shell \e PATH variable in that a colon (':')
       separates entries.  On Windows systems, a semicolon (';") is used as a separator. The data
       dictionary  code  will  attempt to load each file specified in the DCMDICTPATH environment
       variable. It is an error if no data dictionary can be loaded.

FILES

       <datadir>/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO

       xml2dcm(1), dcmconv(1)

       Copyright (C) 2002-2022 by OFFIS e.V., Escherweg 2, 26121 Oldenburg, Germany.