Provided by: ncbi-entrez-direct_12.0.20190816+ds-1ubuntu0.2_amd64

NAME

       xtract - convert XML into a table of data values

SYNOPSIS

       xtract    [-help]    [-strict]    [-mixed]   [-accent]   [-ascii]   [-compress]   [-stops]
       [-input filename]  [-transform filename]   [-pattern expr]   [-group expr]   [-block expr]
       [-subset expr]     [-path path]     [-if expr [constraint]]    [-unless expr [constraint]]
       [-and condition] [-or condition]  [-else]  [-position pos]  [-equals str]  [-contains str]
       [-is-within str]  [-starts-with str]  [-ends-with str]  [-is-not str]  [-is-equal-to expr]
       [-differs-from expr] [-gt N] [-ge N] [-lt N] [-le N] [-eq N] [-ne N] [-ret str] [-tab str]
       [-sep str] [-pfx str] [-sfx str] [-plg str] [-elg str] [-rst] [-clr] [-pfc str] [-deq str]
       [-wrp tag]  [-def str]  [-lbl str]  [-element element]  [-first element]   [-last element]
       [-NAME]   [-num element]   [-len element]   [-sum element]  [-min element]  [-max element]
       [-inc element] [-dec element] [-sub element] [-avg element] [-dev element]  [-med element]
       [-bin element]    [-bit element]   [-encode element]   [-plain element]   [-upper element]
       [-lower element] [-title element]  [-year element]  [-translate element]  [-terms element]
       [-words element] [-pairs element] [-reverse element] [-letters element] [-clauses element]
       [-indices element] [-e2index] [-revcomp] [-nucleic] [-0-based element]  [-1-based element]
       [-ucsc-based element]   [-insd arg ...]   [-head str]   [-tail str]   [-hd str]  [-tl str]
       [-format fmt]   [-unicode style]    [-script style]    [-mathml terse]    [-filter element
       action target]  [-verify] [-outline] [-synopsis] [-select condition] [-in filename] [-j2x]
       [-set tag] [-rec tag] [-nest flat|recurse|plural|depth] [-examples] [-version]

DESCRIPTION

       xtract converts an XML document into a table of data values  according  to  user-specified
       rules.

OPTIONS

   Processing Flags
       -strict
              Remove HTML and MathML tags.

       -mixed Allow mixed content XML.

       -accent
              Delete Unicode accents and diacritical marks.

       -ascii Convert Unicode to numeric HTML character entities.

       -compress
              Compress runs of spaces.

       -stops Retain stop words in selected phrases.

   Data Source
       -input filename
              Read XML from file instead of standard input.

       -transform filename
              File of substitutions for -translate.

   Exploration Argument Hierarchy
       -pattern expr
       -group expr
       -block expr
       -subset expr
              Name  of  record  within  set.  Use of different argument names allows command-line
              control of nested looping.
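
       As a rough illustration of the nested looping described above, the Python sketch below
       loosely mirrors the looping structure of a command along the lines of
       "xtract -pattern Rec -element Id -block Author -element LastName".  The tag names (Set,
       Rec, Id, Author, LastName) and the sample data are hypothetical, and the code is not
       xtract's implementation; exact output formatting may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the tag names are made up for illustration.
XML = """<Set>
  <Rec><Id>1</Id>
    <Author><LastName>Smith</LastName></Author>
    <Author><LastName>Jones</LastName></Author>
  </Rec>
  <Rec><Id>2</Id>
    <Author><LastName>Lee</LastName></Author>
  </Rec>
</Set>"""


def rows(xml_text):
    """Outer loop per record (like -pattern), inner loop per author (like -block)."""
    root = ET.fromstring(xml_text)
    result = []
    for rec in root.iter("Rec"):            # one output row per record
        fields = [rec.findtext("Id", "")]
        for author in rec.iter("Author"):   # nested loop inside the record
            fields.append(author.findtext("LastName", ""))
        result.append("\t".join(fields))    # fields joined by a tab
    return result


for line in rows(XML):
    print(line)
```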

   Path Navigation
       -path path
              Explore by list of adjacent object names.

   Exploration Constructs
       Object         DateRevised
       Parent/Child   Book/AuthorList
       Path           MedlineCitation/Article/Journal/JournalIssue/PubDate
       Heterogeneous  "PubmedArticleSet/*"
       Exhaustive     "History/**"
       Nested         "*/Taxon"
       Recursive      "**/Gene-commentary"

   Conditional Execution
       -if expr [constraint]
              Element (or @attribute) must exist and satisfy any specified constraint.

       -unless expr [constraint]
              Skip if element matches.

       -and condition
              Preceding and following tests must both pass.

       -or condition
              Any passing test suffices.

       -else  Execute if conditional test failed.

       -position pos
              first/last/outer/inner/even/odd/all.

   String Constraints
       -equals str
              String must match exactly.

       -contains str
              Substring must be present.

       -is-within str
              Element value must be present within str.

       -starts-with str
              Substring must be at beginning.

       -ends-with str
              Substring must be at end.

       -is-not str
              String must not match.
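
       The string constraints can be pictured as simple case-insensitive tests (see NOTES).
       The following Python sketch is only an approximation under stated assumptions: the
       function names are hypothetical, and treating -is-within as the inverse of -contains is
       an assumption rather than something this page spells out.

```python
# Hypothetical helper names; each takes the element value and the constraint string.
def equals(value, s):
    return value.casefold() == s.casefold()


def contains(value, s):
    return s.casefold() in value.casefold()


def is_within(value, s):
    # Assumption: the element value must occur inside the constraint string.
    return value.casefold() in s.casefold()


def starts_with(value, s):
    return value.casefold().startswith(s.casefold())


def ends_with(value, s):
    return value.casefold().endswith(s.casefold())


def is_not(value, s):
    return not equals(value, s)


print(contains("Venom Phospholipase", "phospho"))
```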

   Object Constraints
       -is-equal-to expr
              Object values must match.

       -differs-from expr
              Object values must differ.

   Numeric Constraints
       -gt N  Greater than.

       -ge N  Greater than or equal to.

       -lt N  Less than.

       -le N  Less than or equal to.

       -eq N  Equal to.

       -ne N  Not equal to.

   Format Customization
       -ret str
              Override line break between patterns.

       -tab str
              Replace tab character between fields.

       -sep str
              Separator between group members.

       -pfx str
              Prefix to print before group.

       -sfx str
              Suffix to print after group.

       -plg str
              Prologue to print once before elements.

       -elg str
              Epilogue to print once after elements.

       -rst   Reset -sep through -elg.

       -clr   Clear queued tab separator.

       -pfc str
              Preface combines -clr and -pfx.

       -deq str
              Delete and replace queued tab separator.

       -wrp tag
              Wrap elements in XML object.

       -def str
              Default placeholder for missing fields.

       -lbl str
              Insert arbitrary text.

   Element Selection
       -element element
              Print all items that match tag name.

       -first element
              Only print value of first item.

       -last element
              Only print value of last item.

       -NAME  Record value in named variable.

   -element Constructs
       Tag            Caption
       Group          Initials,LastName
       Parent/Child   MedlineCitation/PMID
       Recursive      "**/Gene-commentary_accession"
       Unrestricted   PubDate/*
       Attribute      DescriptorName@MajorTopicYN
       Range          MedlineDate[1:4]
       Substring      "Title[phospholipase | rattlesnake]"
       Object Count   "#Author"
       Item Length    "%Title"
       Element Depth  "^PMID"
       Variable       "&NAME"
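
       To make a few of these constructs concrete, the Python sketch below imitates Range,
       Object Count, and Item Length on a small record.  It assumes that the range in
       MedlineDate[1:4] is 1-based and inclusive; the sample record and the helper name
       char_range are hypothetical, and this is not how xtract itself is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical record for illustration only.
XML = """<Rec>
  <Title>Venom phospholipase</Title>
  <MedlineDate>1998 Dec-1999 Jan</MedlineDate>
  <Author>A</Author><Author>B</Author>
</Rec>"""


def char_range(value, start, stop):
    """Characters start..stop, assuming 1-based inclusive positions."""
    return value[start - 1:stop]


rec = ET.fromstring(XML)
date_prefix = char_range(rec.findtext("MedlineDate"), 1, 4)  # like MedlineDate[1:4]
author_count = len(rec.findall("Author"))                    # like "#Author"
title_length = len(rec.findtext("Title"))                    # like "%Title"

print(date_prefix, author_count, title_length)
```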

   Special -element Operations
       Parent Index   "+"
       Object Name    "?"
       XML Subtree    "*"
       Children       "$"
       Attributes     "@"

   Numeric Processing
       -num element
              Count.

       -len element
              Length.

       -sum element
              Sum.

       -min element
              Minimum.

       -max element
              Maximum.

       -inc element
              Increment.

       -dec element
              Decrement.

       -sub element
              Difference.

       -avg element
              Average.

       -dev element
              Deviation.

       -med element
              Median.

       -bin element
              Binary.

       -bit element
              Bit count.
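
       The terse labels above correspond to familiar aggregate operations.  The Python sketch
       below shows one plausible reading of several of them over a list of integer values; the
       function names are hypothetical, and xtract's own rounding and output format may
       differ.

```python
import statistics


def summarize(values):
    """Aggregate integer strings the way -num, -sum, -min, -max, -avg, -med suggest."""
    nums = [int(v) for v in values]
    return {
        "num": len(nums),
        "sum": sum(nums),
        "min": min(nums),
        "max": max(nums),
        "avg": statistics.mean(nums),    # xtract may round differently
        "med": statistics.median(nums),
    }


def to_binary(n):
    """Binary representation, one reading of -bin."""
    return format(n, "b")


def bit_count(n):
    """Number of set bits, one reading of -bit."""
    return bin(n).count("1")


print(summarize(["4", "8", "6"]), to_binary(10), bit_count(10))
```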

   String Processing
       -encode element
              HTML-escape <, >, &, ", and ' characters.

       -plain element
              Remove embedded mixed-content markup tags.

       -upper element
              Convert text to uppercase.

       -lower element
              Convert text to lowercase.

       -title element
              Capitalize initial letters of words.

       -year element
              Extract first 4-digit year from string.

       -translate element
              Substitute values with -transform table.
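
       As an informal illustration of two of these operations, the Python sketch below
       extracts the first 4-digit run from a string (as -year is described) and capitalizes
       the initial letter of each word (as -title is described).  The function names are
       hypothetical, and leaving the remaining letters of each word unchanged is an
       assumption; xtract may behave differently on edge cases.

```python
import re


def first_year(text):
    """Return the first run of four digits, or an empty string if none is found."""
    match = re.search(r"\d{4}", text)
    return match.group(0) if match else ""


def capitalize_words(text):
    """Uppercase the first letter of each space-separated word (rest left unchanged)."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split())


print(first_year("1998 Dec-1999 Jan"))
print(capitalize_words("gene expression data"))
```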

   Text Processing
       -terms element
              Partition text at spaces.

       -words element
              Split at punctuation marks.

       -pairs element
              Adjacent informative words.

       -reverse element
              Reverse words in string.

       -letters element
              Separate individual letters.

       -clauses element
              Break at phrase separators.

       -indices element
              Word pair index generation.

       -e2index
              Create Entrez index XML.
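
       The simpler text-splitting operations can be approximated as follows.  This Python
       sketch is an assumption-laden illustration with hypothetical function names: it treats
       every non-alphanumeric character as a word break for -words (lowercasing the result,
       per NOTES) and does not attempt the stop-word handling that -pairs and -indices
       involve.

```python
import re


def terms(text):
    """Partition text at whitespace."""
    return text.split()


def words(text):
    """Split at punctuation and spaces, lowercased (assumed tokenization)."""
    return [w for w in re.split(r"[^0-9A-Za-z]+", text.lower()) if w]


def letters(text):
    """Separate individual non-space characters."""
    return [c for c in text if not c.isspace()]


def reverse_words(text):
    """Reverse the order of words in a string."""
    return " ".join(reversed(text.split()))


sample = "beta-catenin binds TCF."
print(terms(sample))
print(words(sample))
print(reverse_words("one two three"))
```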

   Sequence Processing
       -revcomp
              Reverse-complement nucleotide sequence.

       -nucleic
              Subrange determines forward or revcomp.
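
       For readers unfamiliar with the term, reverse-complementing a nucleotide sequence means
       complementing each base and reversing the result.  The Python sketch below shows the
       idea for unambiguous DNA bases only.  The subrange helper is a guess at what
       "subrange determines forward or revcomp" means (a start greater than the stop selects
       the reverse strand, with 1-based inclusive positions); both function names are
       hypothetical.

```python
# Complement table for unambiguous DNA bases only (no IUPAC ambiguity codes).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Complement each base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def subrange(seq, start, stop):
    """1-based inclusive subrange; start > stop is assumed to mean the reverse strand."""
    if start <= stop:
        return seq[start - 1:stop]
    return revcomp(seq[stop - 1:start])


print(revcomp("ATGC"))
print(subrange("AAACCCGGGT", 4, 2))
```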

   Sequence Coordinates
       -0-based element
              Zero-based.

       -1-based element
              One-based.

       -ucsc-based element
              Half-open.
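
       The three coordinate conventions differ only in where counting starts and whether the
       end position is included.  The Python sketch below converts a 1-based closed interval
       to the other two forms and back; it illustrates the conventions themselves, not the
       direction in which the xtract options convert a given element, and the function names
       are hypothetical.

```python
def one_to_zero(start, stop):
    """1-based closed [start, stop] -> 0-based closed."""
    return start - 1, stop - 1


def one_to_ucsc(start, stop):
    """1-based closed [start, stop] -> 0-based half-open [start, stop)."""
    return start - 1, stop


def ucsc_to_one(start, stop):
    """0-based half-open [start, stop) -> 1-based closed."""
    return start + 1, stop


# The first ten bases of a sequence in each convention.
print(one_to_zero(1, 10), one_to_ucsc(1, 10), ucsc_to_one(0, 10))
```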

   Command Generator
       -insd arg ...
              Generate INSDSeq extraction commands.  Print them if invoked standalone;  run  them
              if invoked as part of a pipeline.  Requires one or more arguments, which may appear
              in the following order:

              Descriptor(s)  INSDSeq_sequence/INSDSeq_definition/INSDSeq_division/... [...]

              Completeness   complete/partial

              Feature(s)     CDS/mRNA/...[,...]

              Qualifier(s)   INSDFeature_key/"#INSDInterval"/gene/product/sub_sequence/... [...]

   Miscellaneous
       -head str
              Print before everything else.

       -tail str
              Print after everything else.

       -hd str
              Print before each record.

       -tl str
              Print after each record.

   Phrase Filtering
       -require str
              Keep records that contain a given phrase.

       -exclude str
              Keep records that do not contain a given phrase.

   Reformatting
       -format fmt
              copy     Fast block copy (still applies processing flags).
              compact  Compress runs of spaces.
              flush    Suppress line indentation.
              indent   Indent according to nesting depth.
              expand   Place each attribute on a separate line.

       -unicode style
              How to handle Unicode superscript and subscript digits (first  converted  to  ASCII
              form in all cases).
              fuse     Run them all together, with no additional markup.
              space    Add spaces between digits in different positions.
              period   Add periods between digits in different positions.
              brackets Surround superscripts by square brackets and subscripts by parentheses.
              markdown Surround superscripts with carets and subscripts with tildes.
              slash    Add  backslashes  when  going  up in height and forward slashes when going
                       down.
              tag      Put superscripts in XML sup elements and subscripts in sub elements.

       -script style
              How to handle XML sup and  sub  elements  (denoting  superscripts  and  subscripts,
              respectively).
              brackets Surround superscripts by square brackets and subscripts by parentheses.
              markdown Surround superscripts with carets and subscripts with tildes.

       -mathml terse
              Flatten MathML markup tersely.

   Modification
       -filter element action target
              Actions:
              retain      Keep matching elements (no-op).
              remove      Remove matching elements.
              encode      HTML-escape special characters.
              decode      Decode HTML escapes.
              shrink      Compress runs of spaces.
              expand      Place each attribute on a separate line.
              accent      Strip off Unicode accents.

              Targets:
              content     Plain-text content.
              cdata       CDATA blocks.
              comment     Comments.
              object      The whole object.
              attributes  Attributes.
              container   Start and end tags.

   Validation
       -verify
              Report XML data integrity problems.

   Summary
       -outline
              Display outline of XML structure.

       -synopsis
              Display count of unique XML paths.
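
       The idea behind counting unique XML paths can be sketched in a few lines of Python.
       The example below walks a small, hypothetical document and tallies each distinct
       element path; it only approximates the kind of summary -synopsis produces, and the
       output layout of xtract itself is not reproduced here.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical document for illustration only.
XML = "<Set><Rec><Id>1</Id><Author/><Author/></Rec><Rec><Id>2</Id></Rec></Set>"


def path_counts(xml_text):
    """Count how many times each slash-separated element path occurs."""
    counts = Counter()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


for path, count in path_counts(XML).items():
    print(count, path, sep="\t")
```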

   Record Selection
       -select condition
              Select record subset by conditions.

       -in filename
              File of identifiers to select.

   Data Conversion
       -j2x   Convert JSON stream to XML suitable for -path navigation.

       -set tag
              Replace set wrapper tag.

       -rec tag
              Replace record wrapper tag.

       -nest flat|recurse|plural|depth
              Nested array naming policy.

   Documentation
       -help  Print usage information and some example argument combinations.

       -examples
              Complete examples of edirect(1) and xtract usage.

       -version
              Print version number.

NOTES

       String constraints use case-insensitive comparisons.

       Numeric constraints and selection arguments use integer values.

       -num and -len selections are synonyms for Object Count (#) and Item Length (%).

       -words, -pairs, and -indices convert to lower case.

SEE ALSO

       download-ncbi-data(1),    edirect(1),    esample(1),    index-bioc(1),    index-pubmed(1),
       pm-index(1), pm-invert(1), pm-stash(1), rchive(1), transmute(1), xml2tbl(1), xy-plot(1).