Provided by: ncbi-entrez-direct_7.40.20170928+ds-1_amd64
NAME
xtract - convert XML into a table of data values
SYNOPSIS
xtract [-help] [-strict] [-mixed] [-accent] [-ascii] [-compress] [-spaces] [-input filename] [-pattern expr] [-group expr] [-block expr] [-subset expr] [-if expr [constraint]] [-unless expr [constraint]] [-and condition] [-or condition] [-else] [-position pos] [-equals str] [-contains str] [-starts-with str] [-ends-with str] [-is-not str] [-gt N] [-ge N] [-lt N] [-le N] [-eq N] [-ne N] [-ret str] [-tab str] [-sep str] [-pfx str] [-sfx str] [-clr] [-pfc str] [-rst] [-def str] [-lbl str] [-element element] [-first element] [-last element] [-NAME] [-num element] [-len element] [-sum element] [-min element] [-max element] [-inc element] [-dec element] [-sub element] [-avg element] [-dev element] [-encode element] [-upper element] [-lower element] [-title element] [-terms element] [-words element] [-pairs element] [-letters element] [-indices element] [-phrase str] [-0-based element] [-1-based element] [-ucsc-based element] [-insd arg ...] [-head str] [-tail str] [-hd str] [-tl str] [-format fmt] [-filter element action target] [-verify] [-outline] [-synopsis] [-archive directory] [-index element] [-flag strict|mixed|none] [-gzip] [-hash] [-skip filename] [-examples] [-extras] [-version]
DESCRIPTION
xtract converts an XML document into a table of data values according to user-specified rules.
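As a rough illustration of what "a table of data values" means here, the Python sketch below mimics the general idea: one output row per record, one field per requested element. It is not xtract's implementation. The function name extract_table, the sample tags (Set, Rec, Id, Name), and the handling of repeated or missing elements are assumptions made for this example; only the roles of the pattern, element, tab, sep, and ret arguments are taken from the OPTIONS text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; these tag names are not from any NCBI schema.
SAMPLE = """<Set>
  <Rec><Id>1</Id><Name>alpha</Name><Name>beta</Name></Rec>
  <Rec><Id>2</Id><Name>gamma</Name></Rec>
</Set>"""


def extract_table(xml_text, pattern, elements, tab="\t", sep="\t", ret="\n"):
    """Return one row per <pattern> record, one field per element name.

    Assumed behavior for this sketch:
      - repeated elements within a record are joined with `sep`
      - fields are joined with `tab`
      - each row is terminated with `ret`
      - elements absent from a record are skipped
    """
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(pattern):
        fields = []
        for name in elements:
            values = [(item.text or "").strip() for item in record.iter(name)]
            if values:
                fields.append(sep.join(values))
        rows.append(tab.join(fields))
    return "".join(row + ret for row in rows)


# Print the table for the sample, joining repeated names with "|".
print(extract_table(SAMPLE, "Rec", ["Id", "Name"], sep="|"), end="")
```

The two sample records become two lines, with the repeated Name values of the first record sharing a single field.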
OPTIONS
   Processing Flags
       -strict                   Remove HTML highlight tags.
       -mixed                    Allow PubMed mixed content.
       -accent                   Delete Unicode accents.
       -ascii                    Convert Unicode to numeric character references.
       -compress                 Compress runs of spaces.
       -spaces                   Fix non-ASCII spaces.
       -input filename           Read XML from file instead of standard input.

   Exploration Argument Hierarchy
       -pattern expr
       -group expr
       -block expr
       -subset expr
              Name of record within set. Use of different argument names
              allows command-line control of nested looping.

   Exploration Constructs
       Object                    DateCreated
       Parent/Child              Book/AuthorList
       Heterogeneous             "PubmedArticleSet/*"
       Nested                    "*/Taxon"
       Recursive                 "**/Gene-commentary"

   Conditional Execution
       -if expr [constraint]     Element (or @attribute) must exist and
                                 satisfy any specified constraint.
       -unless expr [constraint] Skip if element matches.
       -and condition            Preceding and following tests must both pass.
       -or condition             Any passing test suffices.
       -else                     Execute if conditional test failed.
       -position pos             Must be at first/last location in list.

   String Constraints
       -equals str               String must match exactly.
       -contains str             Substring must be present.
       -starts-with str          Substring must be at beginning.
       -ends-with str            Substring must be at end.
       -is-not str               String must not match.

   Numeric Constraints
       -gt N                     Greater than.
       -ge N                     Greater than or equal to.
       -lt N                     Less than.
       -le N                     Less than or equal to.
       -eq N                     Equal to.
       -ne N                     Not equal to.

   Format Customization
       -ret str                  Override line break between patterns.
       -tab str                  Replace tab character between fields.
       -sep str                  Separator between group members.
       -pfx str                  Prefix to print before group.
       -sfx str                  Suffix to print after group.
       -clr                      Clear queued tab separator.
       -pfc str                  Preface combines -clr and -pfx.
       -rst                      Reset -sep, -pfx, and -sfx.
       -def str                  Default placeholder for missing fields.
       -lbl str                  Insert arbitrary text.

   Element Selection
       -element element          Print all items that match tag name.
       -first element            Only print value of first item.
       -last element             Only print value of last item.
       -NAME                     Record value in named variable.

   -element Constructs
       Tag                       Caption
       Group                     Initials,LastName
       Attribute                 DescriptorName@MajorTopicYN
       Recursive                 "**/Gene-commentary_accession"
       Object Count              "#Author"
       Item Length               "%Title"
       Element Depth             "^PMID"
       Variable                  "&NAME"

   Special -element Operations
       Parent Index              "+"
       XML Subtree               "*"
       Children                  "$"
       Attributes                "@"

   Numeric Processing
       -num element              Count.
       -len element              Length.
       -sum element              Sum.
       -min element              Minimum.
       -max element              Maximum.
       -inc element              Increment.
       -dec element              Decrement.
       -sub element              Difference.
       -avg element              Average.
       -dev element              Deviation.

   String Processing
       -encode element           URL-encode <, >, &, ", and ' characters.
       -upper element            Convert text to uppercase.
       -lower element            Convert text to lowercase.
       -title element            Capitalize initial letters of words.

   Phrase Processing
       -terms element            Partition phrase at spaces.
       -words element            Split at punctuation marks.
       -pairs element            Adjacent informative words.
       -letters element          Separate individual letters.
       -indices element          Experimental index generation.

   Phrase Filtering
       -phrase str               Keep records that contain a given phrase.

   Sequence Coordinates
       -0-based element          Zero-based.
       -1-based element          One-based.
       -ucsc-based element       Half-open.

   Command Generator
       -insd arg ...
              Generate INSDSeq extraction commands. Print them if invoked
              standalone; run them if invoked as part of a pipeline.
              Requires one or more arguments, which may appear in the
              following order:

              Descriptor(s)      INSDSeq_sequence/INSDSeq_definition/INSDSeq_division/... [...]
              Completeness       complete/partial
              Feature(s)         CDS/mRNA/...[,...]
              Qualifier(s)       INSDFeature_key/"#INSDInterval"/gene/product/... [...]

   Miscellaneous
       -head str                 Print before everything else.
       -tail str                 Print after everything else.
       -hd str                   Print before each record.
       -tl str                   Print after each record.

   Reformatting
       -format fmt
              copy               Fast block copy (still applies processing flags).
              compact            Compress runs of spaces.
              flush              Suppress line indentation.
              indent             Indent according to nesting depth.
              expand             Place each attribute on a separate line.

   Modification
       -filter element action target
              Actions:
              retain             Keep matching elements (no-op).
              remove             Remove matching elements.
              encode             HTML-escape special characters.
              decode             Decode HTML escapes.
              shrink             Compress runs of spaces.
              expand             Place each attribute on a separate line.
              accent             Strip off Unicode accents.

              Targets:
              content            Plain-text content.
              cdata              CDATA blocks.
              comment            Comments.
              object             The whole object.
              attributes         Attributes.
              container          Start and end tags.

   Summary
       -outline                  Display outline of XML structure.
       -synopsis                 Display count of unique XML paths.

   Local Record Indexing
       -archive directory        Base path for individual XML files.
       -index element            Name of element to use for identifier.
       -flag strict|mixed|none
       -gzip                     Use compression for local XML files.
       -hash                     Print UIDs and checksum values to standard output.
       -skip filename            File of UIDs to skip.

   Documentation
       -help                     Print usage information and some example
                                 argument combinations.
       -examples                 Complete examples of edirect(1) and xtract usage.
       -extras                   Batch and local processing examples, and a
                                 summary of specialized options the main
                                 -help text doesn't cover.
       -version                  Print version number.
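The three conventions named under Sequence Coordinates (zero-based, one-based, half-open) are easy to confuse, so the Python sketch below shows how one interval looks in each. It illustrates the conventions in general, not xtract's own conversion logic: this page does not say which convention xtract assumes for its input, so the sketch assumes a one-based, inclusive start and stop. The function name convert_interval and the dictionary keys are hypothetical.

```python
def convert_interval(start, stop):
    """Express a 1-based inclusive interval in three coordinate conventions.

    Assumption for this sketch: the input is 1-based and inclusive.
    """
    if start < 1 or stop < start:
        raise ValueError("expected 1 <= start <= stop")
    return {
        # Unchanged: first base is numbered 1, stop is included.
        "1-based": (start, stop),
        # Both ends shift down by one: first base is numbered 0.
        "0-based": (start - 1, stop - 1),
        # Zero-based start, exclusive end, so length == end - start.
        "half-open": (start - 1, stop),
    }


coords = convert_interval(101, 200)
for name, (begin, end) in coords.items():
    print(name, begin, end)
```

In the half-open form the length of the interval is simply end minus start, which is the usual reason for choosing that convention.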
NOTES
String constraints use case-insensitive comparisons. Numeric constraints and selection arguments use integer values. The -num and -len selections are synonyms for Object Count (#) and Item Length (%). -words, -pairs, and -indices convert to lowercase.
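To make the first two notes concrete, the Python sketch below models the constraint tests as described: string constraints compare without regard to case, and numeric constraints compare integers. It is an approximation for illustration, not xtract's code. The function name satisfies is hypothetical, and treating a non-integer value as a failed numeric test is an assumption this page does not confirm.

```python
import operator

# String tests, applied after both sides have been case-folded.
STRING_TESTS = {
    "-equals": lambda value, target: value == target,
    "-contains": lambda value, target: target in value,
    "-starts-with": lambda value, target: value.startswith(target),
    "-ends-with": lambda value, target: value.endswith(target),
    "-is-not": lambda value, target: value != target,
}

# Numeric tests, applied to integer values.
NUMERIC_TESTS = {
    "-gt": operator.gt,
    "-ge": operator.ge,
    "-lt": operator.lt,
    "-le": operator.le,
    "-eq": operator.eq,
    "-ne": operator.ne,
}


def satisfies(value, constraint, target):
    """Return True if `value` passes `constraint` against `target`."""
    if constraint in STRING_TESTS:
        return STRING_TESTS[constraint](value.casefold(), target.casefold())
    if constraint in NUMERIC_TESTS:
        try:
            return NUMERIC_TESTS[constraint](int(value), int(target))
        except ValueError:
            # Assumption: a value that is not an integer fails the test.
            return False
    raise ValueError("unknown constraint: " + constraint)


print(satisfies("Homo sapiens", "-starts-with", "HOMO"))
print(satisfies("12", "-gt", "9"))
```

The second call shows why the integer rule matters: compared as text, "12" would sort before "9", but as integers 12 is greater than 9.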
SEE ALSO
edirect(1), xy-plot(1).