Provided by: dcmtk_3.6.9-6_amd64

NAME

       dcm2xml - Convert DICOM file and data set to XML

SYNOPSIS

       dcm2xml [options] dcmfile-in [xmlfile-out]

DESCRIPTION

       The  dcm2xml  utility  converts  the  contents  of  a  DICOM  file  (file  format or raw data set) to XML
       (Extensible Markup Language). There are two output formats. The first one is specific to DCMTK  with  its
       DTD  (Document  Type  Definition) described in the file dcm2xml.dtd. The second one refers to the "Native
       DICOM Model" which is specified for the DICOM Application Hosting service found in DICOM part 19.

       If dcm2xml reads a raw data set (DICOM data without a file format meta-header) it will attempt  to  guess
       the  transfer syntax by examining the first few bytes of the file. It is not always possible to correctly
       guess the transfer syntax and it is better to convert a data set  to  a  file  format  whenever  possible
       (using  the  dcmconv  utility). It is also possible to use the -f and -t[ieb] options to force dcm2xml to
       read a data set with a particular transfer syntax.
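
       As a sketch of how these options combine, the following Python helper assembles an argument list for
       a raw data set whose transfer syntax is known in advance. The helper name build_dcm2xml_args, its
       parameters and the dictionary keys are hypothetical; only the option names (-f, -ti, -te, -tb) are
       taken from this page. The list is only built here, nothing is executed.

```python
from typing import List, Optional

# Transfer syntax options listed under "input transfer syntax" on this page.
# The dictionary keys are illustrative names, not DCMTK terminology.
XFER_OPTIONS = {
    "implicit-little": "-ti",
    "explicit-little": "-te",
    "explicit-big": "-tb",
}


def build_dcm2xml_args(infile: str,
                       outfile: Optional[str] = None,
                       xfer: Optional[str] = None) -> List[str]:
    """Return an argv list for dcm2xml (hypothetical helper, nothing is run).

    If xfer is given, the input is treated as a raw data set (-f) and read
    with the requested transfer syntax instead of relying on guessing.
    """
    args = ["dcm2xml"]
    if xfer is not None:
        args += ["-f", XFER_OPTIONS[xfer]]
    args.append(infile)
    if outfile is not None:
        args.append(outfile)
    return args


if __name__ == "__main__":
    print(build_dcm2xml_args("dataset.dcm", "dataset.xml", xfer="explicit-little"))
```

       On a system where dcm2xml is installed, the returned list could be passed to subprocess.run().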

PARAMETERS

       dcmfile-in   DICOM input filename to be converted ("-" for stdin)

       xmlfile-out  XML output filename (default: stdout)

OPTIONS

   general options
         -h    --help
                 print this help text and exit

               --version
                 print version information and exit

               --arguments
                 print expanded command line arguments

         -q    --quiet
                 quiet mode, print no warnings and errors

         -v    --verbose
                 verbose mode, print processing details

         -d    --debug
                 debug mode, print debug information

         -ll   --log-level  [l]evel: string constant
                 (fatal, error, warn, info, debug, trace)
                 use level l for the logger

         -lc   --log-config  [f]ilename: string
                 use config file f for the logger

   input options
       input file format:

         +f    --read-file
                 read file format or data set (default)

         +fo   --read-file-only
                 read file format only

         -f    --read-dataset
                 read data set without file meta information

       input transfer syntax:

         -t=   --read-xfer-auto
                 use TS recognition (default)

         -td   --read-xfer-detect
                 ignore TS specified in the file meta header

         -te   --read-xfer-little
                 read with explicit VR little endian TS

         -tb   --read-xfer-big
                 read with explicit VR big endian TS

         -ti   --read-xfer-implicit
                 read with implicit VR little endian TS

       long tag values:

         +M    --load-all
                 load very long tag values (e.g. pixel data)

         -M    --load-short
                 do not load very long values (default)

         +R    --max-read-length  [k]bytes: integer (4..4194302, default: 4)
                 set threshold for long values to k kbytes

   processing options
       specific character set:

         +Cr   --charset-require
                 require declaration of extended charset (default)

         +Ca   --charset-assume  [c]harset: string
                 assume charset c if no extended charset declared

         +Cc   --charset-check-all
                 check all data elements with string values
                 (default: only PN, LO, LT, SH, ST, UC and UT)

                 # this option is only used for the extended check whether
                 # the Specific Character Set (0008,0005) attribute should be
                 # present, but not for the conversion of unaffected element
                 # values to UTF-8 (e.g. element values with a VR of CS)

         +U8   --convert-to-utf8
                 convert all element values that are affected
                 by Specific Character Set (0008,0005) to UTF-8

                 # requires support from an underlying character encoding
                 # library (see output of --version on which one is available)

   output options
       general XML format:

         -dtk  --dcmtk-format
                 output in DCMTK-specific format (default)

         -nat  --native-format
                 output in Native DICOM Model format (part 19)

         +Xn   --use-xml-namespace
                 add XML namespace declaration to root element

       DCMTK-specific format (not with --native-format):

         +Xd   --add-dtd-reference
                 add reference to document type definition (DTD)

         +Xe   --embed-dtd-content
                 embed document type definition into XML document

         +Xf   --use-dtd-file  [f]ilename: string
                 use specified DTD file (only with +Xe)
                 (default: /usr/local/share/dcmtk-<VERSION>/dcm2xml.dtd)

         +Wn   --write-element-name
                 write name of the DICOM data elements (default)

         -Wn   --no-element-name
                 do not write name of the DICOM data elements

         +Wb   --write-binary-data
                 write binary data of OB and OW elements
                 (default: off, be careful with --load-all)

       encoding of binary data:

         +Eh   --encode-hex
                 encode binary data as hex numbers
                 (default for DCMTK-specific format)

         +Eu   --encode-uuid
                 encode binary data as a UUID reference
                 (default for Native DICOM Model)

         +Eb   --encode-base64
                 encode binary data as Base64 (RFC 2045, MIME)

DCMTK Format

       The basic structure of the DCMTK-specific XML output created from a DICOM file looks like the following:

       <?xml version="1.0" encoding="ISO-8859-1"?>
       <!DOCTYPE file-format SYSTEM "dcm2xml.dtd">
       <file-format xmlns="http://dicom.offis.de/dcmtk">
         <meta-header xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
           <element tag="0002,0000" vr="UL" vm="1" len="4"
                    name="MetaElementGroupLength">
             166
           </element>
           ...
           <element tag="0002,0013" vr="SH" vm="1" len="16"
                    name="ImplementationVersionName">
             OFFIS_DCMTK_353
           </element>
         </meta-header>
         <data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
           <element tag="0008,0005" vr="CS" vm="1" len="10"
                    name="SpecificCharacterSet">
             ISO_IR 100
           </element>
           ...
           <sequence tag="0028,3010" vr="SQ" card="2" name="VOILUTSequence">
             <item card="3">
               <element tag="0028,3002" vr="xs" vm="3" len="6"
                        name="LUTDescriptor">
                 256\0\8
               </element>
               ...
             </item>
             ...
           </sequence>
           ...
           <element tag="7fe0,0010" vr="OW" vm="1" len="262144"
                    name="PixelData" loaded="no" binary="hidden">
           </element>
         </data-set>
       </file-format>

       The "file-format" and "meta-header" tags are absent for DICOM data sets.
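
       A minimal sketch of reading this structure with Python's standard xml.etree.ElementTree module
       follows. The sample document is hand-written after the example above and is not real tool output.
       The function names are hypothetical. Using the "vm" attribute to decide whether a value is split at
       backslashes is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Hand-written sample modeled on the structure shown above (not real tool output).
SAMPLE = rb"""<?xml version="1.0" encoding="ISO-8859-1"?>
<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
  <element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 100</element>
  <sequence tag="0028,3010" vr="SQ" card="1" name="VOILUTSequence">
    <item card="1">
      <element tag="0028,3002" vr="xs" vm="3" len="6" name="LUTDescriptor">256\0\8</element>
    </item>
  </sequence>
</data-set>
"""


def local_name(node):
    """Strip an XML namespace such as '{http://dicom.offis.de/dcmtk}' from a tag."""
    return node.tag.rsplit("}", 1)[-1]


def iter_elements(node, depth=0):
    """Yield (depth, tag, vr, values) for every element, descending into sequences."""
    for child in node:
        kind = local_name(child)
        if kind == "element":
            text = (child.text or "").strip()
            if not text:
                values = []
            elif int(child.get("vm", "1")) > 1:
                # Multiple values are separated by a backslash.
                values = text.split("\\")
            else:
                values = [text]
            yield depth, child.get("tag"), child.get("vr"), values
        elif kind in ("meta-header", "data-set", "sequence", "item"):
            # Only items increase the nesting depth reported to the caller.
            next_depth = depth + 1 if kind == "item" else depth
            yield from iter_elements(child, next_depth)


if __name__ == "__main__":
    for depth, tag, vr, values in iter_elements(ET.fromstring(SAMPLE)):
        print("  " * depth + f"({tag}) {vr} {values}")
```

       Because the root element is only iterated over, the same function accepts a "file-format" root as
       well as a bare "data-set" root.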

   XML Encoding
       Attributes with very large value fields (e.g. pixel data) are not loaded by default. They can be
       identified by the additional attribute "loaded" with a value of "no" (see example above). The command
       line option --load-all forces all value fields to be loaded, including the very long ones.

       Furthermore, binary data of OB and OW attributes are not written to the XML output file by default. These
       elements can be identified by the additional attribute "binary" with a value of "hidden" (default is
       "no"). The command line option --write-binary-data causes binary value fields to be printed as well
       (attribute value is "yes" or "base64"). Be careful when using this option together with --load-all
       because of the large amounts of pixel data that might be printed to the output. Please note that in this
       context element values with a VR of OD, OF, OL and OV are not regarded as "binary data".
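
       The two attributes can be checked by a consumer of the XML output, for instance to report which
       values were left out. The following Python sketch assumes the attribute values described above. The
       sample document is hand-written, and the function names and status strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample (not real tool output).
SAMPLE = rb"""<data-set xfer="1.2.840.10008.1.2" name="Little Endian Implicit">
  <element tag="0010,0010" vr="PN" vm="1" len="8" name="PatientName">Doe^John</element>
  <element tag="7fe0,0010" vr="OW" vm="1" len="262144" name="PixelData" loaded="no" binary="hidden"></element>
</data-set>
"""


def value_status(element):
    """Classify one <element> by its "loaded" and "binary" attributes."""
    if element.get("loaded") == "no":
        return "not loaded"          # value field was not read from the input
    binary = element.get("binary", "no")
    if binary == "hidden":
        return "hidden"              # OB/OW data not written to the output
    if binary in ("yes", "base64"):
        return "binary:" + binary    # binary data present in the output
    return "text"


def summarize(root):
    """Return (tag, status) pairs; iter() also reaches elements nested in items."""
    return [(e.get("tag"), value_status(e))
            for e in root.iter()
            if e.tag.rsplit("}", 1)[-1] == "element"]


if __name__ == "__main__":
    for tag, status in summarize(ET.fromstring(SAMPLE)):
        print(tag, status)
```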

       Multiple values (i.e. where the DICOM value multiplicity is greater than 1) are separated by a backslash
       "\" (except for Base64 encoded data). The "len" attribute indicates the number of bytes for the
       particular value field as stored in the DICOM data set, i.e. it might deviate from the XML encoded
       value length e.g. because of non-significant padding that has been removed. If this attribute is
       missing in "sequence" or "item" start tags, the corresponding DICOM element has been stored with
       undefined length.

Native DICOM Model Format

       The description of the Native DICOM Model format can be found in the DICOM standard, part 19
       ("Application Hosting").

   Bulk Data
       Binary data, i.e. DICOM element values with Value Representations (VR) of OB or OW, as well as OD, OF,
       OL, OV and UN values are by default not written to the XML output because of their size. Instead, for
       each element, a new Universally Unique Identifier (UUID) is being generated and written as an
       attribute of a <BulkData> XML element. So far, there is no possibility to write an additional file to
       hold the binary data for each of the binary data chunks. This is not required by the standard,
       however, it might be useful for implementing an Application Hosting interface; thus this feature may
       be available in future versions of dcm2xml.

       In addition, Supplement 163 (Store Over the Web by Representational State Transfer Services)
       introduces a new <InlineBinary> XML element that allows for encoding binary data as Base64.
       Currently, the command line option --encode-base64 enables this encoding for the following VRs: OB,
       OD, OF, OL, OV, OW and UN.

   Known Issues
       In addition to what is written in the above section on "Bulk Data", there are further known issues
       with the current implementation of the Native DICOM Model format. For example, large element values
       with a VR other than OB, OD, OF, OL, OV, OW or UN are currently never written as bulk data, although
       it might be useful, e.g. for very long text elements (especially UT) or very long numeric fields (of
       various VRs).

NOTES

   Character Encoding
       The XML character encoding is determined automatically from the DICOM attribute (0008,0005) "Specific
       Character Set" using the following mapping:

       ASCII         (ISO_IR 6)    =>  "UTF-8"
       UTF-8         "ISO_IR 192"  =>  "UTF-8"
       ISO Latin 1   "ISO_IR 100"  =>  "ISO-8859-1"
       ISO Latin 2   "ISO_IR 101"  =>  "ISO-8859-2"
       ISO Latin 3   "ISO_IR 109"  =>  "ISO-8859-3"
       ISO Latin 4   "ISO_IR 110"  =>  "ISO-8859-4"
       ISO Latin 5   "ISO_IR 148"  =>  "ISO-8859-9"
       ISO Latin 9   "ISO_IR 203"  =>  "ISO-8859-15"
       Cyrillic      "ISO_IR 144"  =>  "ISO-8859-5"
       Arabic        "ISO_IR 127"  =>  "ISO-8859-6"
       Greek         "ISO_IR 126"  =>  "ISO-8859-7"
       Hebrew        "ISO_IR 138"  =>  "ISO-8859-8"

       If this DICOM attribute is missing in the input file, although needed, option --charset-assume can be
       used to specify an appropriate character set manually (using one of the DICOM defined terms). For
       reasons of backward compatibility with previous versions of this tool, the following terms are also
       supported and mapped automatically to the associated DICOM defined terms: latin-1, latin-2, latin-3,
       latin-4, latin-5, latin-9, cyrillic, arabic, greek, hebrew.

       Multiple character sets using code extension techniques are not supported. If needed, option
       --convert-to-utf8 can be used to convert the DICOM file or data set to UTF-8 encoding prior to the
       conversion to XML format. This is also useful for DICOMDIR files where each directory record can have
       a different character set.

       If no mapping is defined and option --convert-to-utf8 is not used, non-ASCII characters and those
       below #32 are stored as "&#nnn;" where "nnn" refers to the numeric character code. This might lead to
       invalid character entity references (such as "&#27;" for ESC) and will cause most XML parsers to
       reject the document.

LOGGING

       The level of logging output of the various command line tools and underlying libraries can be
       specified by the user. By default, only errors and warnings are written to the standard error stream.
       Using option --verbose also informational messages like processing details are reported. Option
       --debug can be used to get more details on the internal activity, e.g. for debugging purposes. Other
       logging levels can be selected using option --log-level. In --quiet mode only fatal errors are
       reported. In such very severe error events, the application will usually terminate. For more details
       on the different logging levels, see documentation of module "oflog".

       In case the logging output should be written to file (optionally with logfile rotation), to syslog
       (Unix) or the event log (Windows) option --log-config can be used. This configuration file also
       allows for directing only certain messages to a particular output stream and for filtering certain
       messages based on the module or application where they are generated. An example configuration file
       is provided in <etcdir>/logger.cfg.

COMMAND LINE

       All command line tools use the following notation for parameters: square brackets enclose optional
       values (0-1), three trailing dots indicate that multiple values are allowed (1-n), a combination of
       both means 0 to n values.

       Command line options are distinguished from parameters by a leading '+' or '-' sign, respectively.
       Usually, order and position of command line options are arbitrary (i.e. they can appear anywhere).
       However, if options are mutually exclusive the rightmost appearance is used. This behavior conforms
       to the standard evaluation rules of common Unix shells.

       In addition, one or more command files can be specified using an '@' sign as a prefix to the filename
       (e.g. @command.txt). Such a command argument is replaced by the content of the corresponding text
       file (multiple whitespaces are treated as a single separator unless they appear between two quotation
       marks) prior to any further evaluation. Please note that a command file cannot contain another
       command file. This simple but effective approach allows one to summarize common combinations of
       options/parameters and avoids longish and confusing command lines (an example is provided in file
       <datadir>/dumppat.txt).

ENVIRONMENT

       The dcm2xml utility will attempt to load DICOM data dictionaries specified in the DCMDICTPATH
       environment variable. By default, i.e. if the DCMDICTPATH environment variable is not set, the file
       <datadir>/dicom.dic will be loaded unless the dictionary is built into the application (default for
       Windows).

       The default behavior should be preferred and the DCMDICTPATH environment variable only used when
       alternative data dictionaries are required. The DCMDICTPATH environment variable has the same format
       as the Unix shell PATH variable in that a colon (":") separates entries. On Windows systems, a
       semicolon (";") is used as a separator. The data dictionary code will attempt to load each file
       specified in the DCMDICTPATH environment variable. It is an error if no data dictionary can be loaded.
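
       The separator rule above can be illustrated with a short Python sketch that splits a DCMDICTPATH
       value into its entries. This is not DCMTK's own lookup code. The function name is hypothetical and
       skipping empty entries is an assumption. Python's os.pathsep is ":" on Unix and ";" on Windows, which
       matches the convention described here.

```python
import os
from typing import List, Optional


def split_dict_path(value: Optional[str], separator: str = os.pathsep) -> List[str]:
    """Split a DCMDICTPATH-style value into file names.

    Empty entries (e.g. from a trailing separator) are skipped; that choice
    is an assumption of this sketch.
    """
    if not value:
        return []
    return [entry for entry in value.split(separator) if entry]


if __name__ == "__main__":
    # Show the dictionary files named by the current environment, if any.
    for path in split_dict_path(os.environ.get("DCMDICTPATH")):
        state = "found" if os.path.isfile(path) else "missing"
        print(f"{path}: {state}")
```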

       Depending  on  the command line options specified, the dcm2xml utility will attempt to load character set
       mapping tables. This happens when DCMTK was compiled with the oficonv library (which is the default)  and
       the mapping tables are not built into the library (default when DCMTK uses shared libraries).

       The  mapping  table files are expected in DCMTK's <datadir>. The DCMICONVPATH environment variable can be
       used to specify a different location. If a different location is specified,  those  mapping  tables  also
       replace any built-in tables.
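
       Independent of these mapping table files, the fixed mapping from Specific Character Set to XML
       encoding listed under "Character Encoding" can be written as a simple lookup. The Python sketch below
       copies that table. The pairing of the legacy terms (latin-1 etc.) with DICOM defined terms is
       inferred from the row labels of the table and is an assumption, as are the function and variable
       names.

```python
from typing import Optional

# Specific Character Set defined term -> XML encoding, copied from the
# mapping listed under "Character Encoding" on this page.
XML_ENCODING = {
    "ISO_IR 6": "UTF-8",
    "ISO_IR 192": "UTF-8",
    "ISO_IR 100": "ISO-8859-1",
    "ISO_IR 101": "ISO-8859-2",
    "ISO_IR 109": "ISO-8859-3",
    "ISO_IR 110": "ISO-8859-4",
    "ISO_IR 148": "ISO-8859-9",
    "ISO_IR 203": "ISO-8859-15",
    "ISO_IR 144": "ISO-8859-5",
    "ISO_IR 127": "ISO-8859-6",
    "ISO_IR 126": "ISO-8859-7",
    "ISO_IR 138": "ISO-8859-8",
}

# Legacy terms accepted for backward compatibility. The pairing below is
# inferred from the row labels of the table (assumption).
LEGACY_TERMS = {
    "latin-1": "ISO_IR 100",
    "latin-2": "ISO_IR 101",
    "latin-3": "ISO_IR 109",
    "latin-4": "ISO_IR 110",
    "latin-5": "ISO_IR 148",
    "latin-9": "ISO_IR 203",
    "cyrillic": "ISO_IR 144",
    "arabic": "ISO_IR 127",
    "greek": "ISO_IR 126",
    "hebrew": "ISO_IR 138",
}


def xml_encoding_for(charset: str) -> Optional[str]:
    """Return the XML encoding name for a charset term, or None if unmapped."""
    term = charset.strip()
    term = LEGACY_TERMS.get(term.lower(), term)
    return XML_ENCODING.get(term)


if __name__ == "__main__":
    print(xml_encoding_for("ISO_IR 100"))
```

       A result of None corresponds to the case in which no mapping is defined, e.g. for character sets that
       use code extension techniques.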

FILES

       <datadir>/dcm2xml.dtd - Document Type Definition (DTD) file

SEE ALSO

       xml2dcm(1), dcmconv(1)

COPYRIGHT

       Copyright (C) 2002-2024 by OFFIS e.V., Escherweg 2, 26121 Oldenburg, Germany.

Version 3.6.9                               Wed Dec 10 2025 21:34:17                                  dcm2xml(1)