Provided by: reposurgeon_4.3+git20200214.8d048e1-1ubuntu0.2_amd64

NAME

       reposurgeon - surgical operations on repositories

SYNOPSIS

       reposurgeon [command...]

DESCRIPTION

       The purpose of reposurgeon is to enable risky operations that VCSes (version-control
       systems) don’t want to let you do, such as (a) editing past comments and metadata, (b)
       excising commits, (c) coalescing and splitting commits, (d) removing files and subtrees
       from repo history, (e) merging or grafting two or more repos, and (f) cutting a repo in
       two by cutting a parent-child link, preserving the branch structure of both child repos.

       A major use of reposurgeon is to assist a human operator to perform higher-quality
       conversions among version control systems than can be achieved with fully automated
       converters.

       The original motivation for reposurgeon was to clean up artifacts created by repository
       conversions. It was foreseen that the tool would also have applications when code needs to
       be removed from repositories for legal or policy reasons.

       To keep reposurgeon simple and flexible, it normally does not do its own repository
       reading and writing. Instead, it relies on being able to parse and emit the command
       streams created by git-fast-export and read by git-fast-import. This means that it can be
       used on any version-control system that has both fast-export and fast-import utilities.
       The git-import stream format also implicitly defines a common language of primitive
       operations for reposurgeon to speak.

       Fully supported systems (those for which reposurgeon can both read and write repositories)
       include git, hg, bzr, darcs, bk, RCS, and SRC. For a complete list, with dependencies and
       technical notes, type "prefer" at the reposurgeon prompt.

       Writing to the file-oriented systems RCS and SRC is done via rcs-fast-import(1) and has
       some serious limitations because those systems cannot represent all the metadata in a
       git-fast-export stream. Consult that tool’s documentation for details and partial
       workarounds.

       Fossil repository files can be read in using the --format=fossil option of the ‘read’
       command and written out with the --format=fossil option of the ‘write’ command. Ignore
       patterns are not translated in either direction.

       SVN and CVS are supported for read only, not write. For CVS, reposurgeon must be run from
       within a repository directory (one with a CVSROOT subdirectory). Each module becomes a
       subdirectory in the reposurgeon representation of the change history.

       In order to deal with version-control systems that do not have fast-export equivalents,
       reposurgeon can also host extractor code that reads repositories directly. For each
       version-control system supported through an extractor, reposurgeon uses a small amount of
       knowledge about the system’s command-line tools to (in effect) replay repository history
       into an input stream internally. Repositories under systems supported through extractors
       can be read by reposurgeon, but not modified by it. In particular, reposurgeon can be used
       to move a repository history from any VCS supported by an extractor to any VCS supported
       by a normal importer/exporter pair.

       Mercurial repository reading is implemented with an extractor class; writing is handled
       with the stock "hg fastimport" command. A test extractor exists for git, but is normally
       disabled in favor of the regular exporter.

       For guidance on the pragmatics of repository conversion, see the DVCS Migration HOWTO
       <http://www.catb.org/esr/dvcs-migration-guide.html>.

SAFETY WARNINGS

       reposurgeon is a sharp enough tool to cut you. It takes care not to ever write a
       repository in an actually inconsistent state, and will terminate with an error message
       rather than proceed when its internal data structures are confused. However, there are
       lots of things you can do with it - like altering stored commit timestamps so they no
       longer match the commit sequence - that are likely to cause havoc after you’re done.
       Proceed with caution and check your work.

       Also note that, if your DVCS does the usual thing of making commit IDs a cryptographic
       hash of content and parent links, editing a publicly-accessible repository with this tool
       would be a bad idea. All of the surgical operations in reposurgeon will modify the hash
       chains.

       Please also see the notes on system-specific issues under LIMITATIONS AND GUARANTEES.

OPERATION

       The program can be run in one of two modes, either as an interactive command interpreter
       or in batch mode to execute commands given as arguments on the reposurgeon invocation
       line. The only differences between these modes are (1) the interactive one begins by
       turning on the ‘interactive’ option, (2) in batch mode all errors (including normally
       recoverable errors in selection-set syntax) are fatal, and (3) each command-line argument
       beginning with ‘--’ has that stripped off (which in particular means that --help and
       --version will work as expected). Also, in interactive mode, Ctrl-P and Ctrl-N scroll
       through your command history, and tab completion is available for both command keywords
       and name arguments (wherever that makes semantic sense).
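
       The stripping of a leading ‘--’ in batch mode can be pictured with a minimal Python
       sketch. This is an illustration only: the function name normalize_batch_args is
       hypothetical and is not part of reposurgeon.

```python
def normalize_batch_args(argv):
    """Strip one leading '--' from each argument (illustrative only)."""
    return [arg[2:] if arg.startswith("--") else arg for arg in argv]


# Each remaining string would then be treated as one command.
print(normalize_batch_args(["--version", "read repo.fi", "--help"]))
```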

       A git-fast-import stream consists of a sequence of commands which must be executed in the
       specified sequence to build the repo; to avoid confusion with reposurgeon commands we will
       refer to the stream commands as events in this documentation. These events are implicitly
       numbered from 1 upwards. Most commands require specifying a selection of event sequence
       numbers so reposurgeon will know which events to modify or delete.

       For all the details of event types and semantics, see the git-fast-import(1) manual page;
       the rest of this paragraph is a quick start for the impatient. Most events in a stream are
       commits describing revision states of the repository; these group together under a single
       change comment one or more fileops (file operations), which usually point to blobs that
       are revision states of individual files. A fileop may also be a delete operation
       indicating that a specified previously-existing file was deleted as part of the version
       commit; there are a couple of other special fileop types of lesser importance.
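
       To make the idea of numbered events concrete, here is a rough Python sketch that lists
       the events in a small stream. It is not reposurgeon's parser: the helper list_events is
       hypothetical, only the counted ‘data <n>’ payload form is handled, and passthrough
       events are ignored.

```python
import io

EVENT_KEYWORDS = (b"blob", b"commit", b"tag", b"reset")


def list_events(stream):
    """Return the event keywords found in a fast-import stream, in order."""
    events = []
    while True:
        line = stream.readline()
        if not line:
            break
        word = line.split(b" ", 1)[0].strip()
        if word == b"data":
            # Skip the payload so its contents are never mistaken for commands.
            stream.read(int(line.split()[1]))
        elif word in EVENT_KEYWORDS:
            events.append(word.decode())
    return events


SAMPLE = (
    b"blob\n"
    b"mark :1\n"
    b"data 6\n"
    b"hello\n"
    b"commit refs/heads/master\n"
    b"mark :2\n"
    b"committer A U Thor <author@example.com> 1000000000 +0000\n"
    b"data 8\n"
    b"initial\n"
    b"M 100644 :1 README\n"
)

# Events are numbered from 1, as described above.
for number, kind in enumerate(list_events(io.BytesIO(SAMPLE)), 1):
    print(number, kind)
```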

       Commands to reposurgeon consist of a command keyword, sometimes preceded by a selection
       set, sometimes followed by whitespace-separated arguments. It is often possible to omit
       the selection-set argument and have it default to something reasonable.

       Here are some motivating examples. The commands will be explained in more detail after the
       description of selection syntax.

           :15 edit               ;; edit the object associated with mark :15

           edit                   ;; edit all editable objects

           29..71 list            ;; list summary index of events 29..71

           236..$ list            ;; List events from 236 to the last

           <#523> inspect         ;; Look for commit #523; they are numbered
                                  ;; 1-origin from the beginning of the
                                  ;; repository.

           <2317> inspect         ;; Look for a tag with the name 2317, a tip
                                  ;; commit of a branch named 2317, or a commit
                                  ;; with legacy ID 2317. Inspect what is found.
                                  ;; A plain number is probably a legacy ID
                                  ;; inherited from a Subversion revision
                                  ;; number.

           /regression/ list      ;; list all commits and tags with comments or
                                  ;; committer headers or author headers
                                  ;; containing the string "regression"

           1..:97 & =T delete     ;; delete tags from event 1 to mark 97

           [Makefile] inspect     ;; Inspect all commits with a file op touching
                                  ;; Makefile and all blobs referred to in a
                                  ;; fileop touching Makefile.

           :46 tip                ;; Display the branch tip that owns
                                  ;; commit :46.

           @dsc(:55) list         ;; Display all commits with ancestry tracing
                                  ;; to :55

           @min([.gitignore]) remove .gitignore delete
                                  ;; Remove the first .gitignore fileop in the
                                  ;; repo.

       The regular expressions should be in Golang’s <https://github.com/google/re2/wiki/Syntax>
       format, with one exception. Due to a conflict with the use of $ for arguments in the
       script command, we retain Python’s use of backslashes as a leader for references to group
       matches.

       Regular expressions are not anchored. Use ^ and $ to anchor them to the beginning or end
       of the search space, when appropriate.

       Some commands may also take following arguments that are regular expressions. In this
       context, they still require start and end delimiters, but if you need to have a / in the
       expression the delimiters can be any printable character. As a reminder, these are
       described in the embedded help as "delimited" regular expressions.

       Also note that following-argument regular expressions may not contain whitespace; if you
       need to specify whitespace or a non-printable character use a standard C-style escape such
       as \s for space.

   SELECTION SYNTAX
       A selection set is ordered; that is, any given element may occur only once, and the set is
       ordered by when its members were first added.

       The selection-set specification syntax is an expression-oriented minilanguage. The most
       basic term in this language is a location. The following sorts of primitive locations are
       supported:

       event numbers
           A plain numeric literal is interpreted as a 1-origin event-sequence number.

       marks
           A numeric literal preceded by a colon is interpreted as a mark; see the import stream
           format documentation for explanation of the semantics of marks.

       tag and branch names
           The basename of a branch (including branches in the refs/tags namespace) refers to its
           tip commit. The name of a tag is equivalent to its mark (that of the tag itself, not
           the commit it refers to). Tag and branch locations are bracketed with < > (angle
           brackets) to distinguish them from command keywords.

       legacy IDs
           If the contents of name brackets (< >) do not match a tag or branch name, the
           interpreter next searches legacy IDs of commits. This is especially useful when you
           have imported a Subversion dump; it means that commits made from it can be referred to
           by their corresponding Subversion revision numbers.

       commit numbers
           A numeric literal within name brackets (< >) preceded by # is interpreted as a
           1-origin commit-sequence number.

       reset@ names
           A name with the prefix ‘reset@’ refers to the latest reset with a basename matching
           the part after the @. Usually there is only one such reset.

       $
           Refers to the last event.

       These may be grouped into sets in the following ways:

       ranges
           A range is two locations separated by ‘..’, and is the set of events beginning at the
           left-hand location and ending at the right-hand location (inclusive).

       lists
           Comma-separated lists of locations and ranges are accepted, with the obvious meaning.
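
           As a hedged illustration of how ranges and lists build an ordered selection, the
           following Python sketch evaluates expressions made only of event numbers, ‘$’,
           ranges and commas. The function names are hypothetical, and marks, names and the
           other location types are deliberately left out.

```python
def resolve_location(token, last_event):
    """Map '$' to the last event number, anything else to an integer."""
    return last_event if token == "$" else int(token)


def evaluate(expr, last_event):
    """Evaluate a comma-separated list of event numbers and inclusive ranges."""
    selection = []
    for item in expr.split(","):
        if ".." in item:
            left, right = item.split("..", 1)
            members = range(resolve_location(left, last_event),
                            resolve_location(right, last_event) + 1)
        else:
            members = [resolve_location(item, last_event)]
        for number in members:
            # Each element occurs once, ordered by when it was first added.
            if number not in selection:
                selection.append(number)
    return selection


print(evaluate("236..$", 240))
print(evaluate("5,2..4,3", 10))
```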

       There are some other ways to construct event sets:

       visibility sets
           A visibility set is an expression specifying a set of event types. It will consist of
           a leading equal sign, followed by type letters. These are the type letters:

           ┌──┬─────────────────────────┬──────────────────────────┐
           │  │                         │                          │
           │B │ blobs                   │ Most default selection   │
           │  │                         │ sets exclude blobs; they │
           │  │                         │ have to be manipulated   │
           │  │                         │ through the commits they │
           │  │                         │ are attached to.         │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │C │ commits                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │D │ all-delete commits      │ These are artifacts      │
           │  │                         │ produced by some older   │
           │  │                         │ repository-conversion    │
           │  │                         │ tools.                   │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │H │ head (branch tip)       │                          │
           │  │ commits                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │O │ orphaned (parentless)   │                          │
           │  │ commits                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │U │ commits with callout    │                          │
           │  │ parents                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │Z │ commits with no fileops │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │M │ merge (multi-parent)    │                          │
           │  │ commits                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │F │ fork (multi-child)      │                          │
           │  │ commits                 │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │L │ commits with unclean    │ E.g. without a           │
           │  │ multi-line comments     │ separating empty line    │
           │  │                         │ after the first          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │I │ commits for which       │                          │
           │  │ metadata cannot be      │                          │
           │  │ decoded to UTF-8        │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │T │ tags                    │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │R │ resets                  │                          │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │P │ passthroughs            │ All event types simply   │
           │  │                         │ passed through,          │
           │  │                         │ including comments,      │
           │  │                         │ progress commands, and   │
           │  │                         │ checkpoint commands      │
           ├──┼─────────────────────────┼──────────────────────────┤
           │  │                         │                          │
           │N │ Legacy IDs              │ Any comment matching a   │
           │  │                         │ cookie (legacy-ID)       │
           │  │                         │ format.                  │
           └──┴─────────────────────────┴──────────────────────────┘
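
           The structural letters can be pictured with a small Python sketch over a toy
           history. Everything here is hypothetical (the PARENTS and FILEOPS tables, the
           function names), only the letters C, O, M, F and Z are modelled, and treating a
           multi-letter visibility set as the union of its types is an assumption of this
           sketch.

```python
# Hypothetical toy history: commit name -> list of parent names.
PARENTS = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": []}
# Hypothetical number of fileops in each commit.
FILEOPS = {"a": 2, "b": 1, "c": 0, "d": 1, "e": 3}


def type_letters(name):
    """Return the subset of {C, O, M, F, Z} that applies to one commit."""
    children = [c for c, parents in PARENTS.items() if name in parents]
    letters = {"C"}
    if not PARENTS[name]:
        letters.add("O")          # orphaned (parentless)
    if len(PARENTS[name]) > 1:
        letters.add("M")          # merge (multi-parent)
    if len(children) > 1:
        letters.add("F")          # fork (multi-child)
    if FILEOPS[name] == 0:
        letters.add("Z")          # no fileops
    return letters


def visibility(spec):
    """Select commits having any of the type letters in a spec such as '=MF'."""
    wanted = set(spec.lstrip("="))
    return [name for name in PARENTS if type_letters(name) & wanted]


print(visibility("=MF"))
```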

       references
           A reference name (bracketed by angle brackets) resolves to a single object, either a
           commit or tag.

           ┌──────────────┬────────────────────────────────┐
           │              │                                │
           │type          │ interpretation                 │
           ├──────────────┼────────────────────────────────┤
           │              │                                │
           │tag name      │ annotated tag with that name   │
           ├──────────────┼────────────────────────────────┤
           │              │                                │
           │branch name   │ the branch tip commit          │
           ├──────────────┼────────────────────────────────┤
           │              │                                │
           │legacy ID     │ commit with that legacy ID     │
           ├──────────────┼────────────────────────────────┤
           │              │                                │
           │assigned name │ name equated to a selection by │
           │              │ assign                         │
           └──────────────┴────────────────────────────────┘

           Note that if an annotated tag and a branch have the same name foo, <foo> will resolve
           to the tag rather than the branch tip commit.

       dates and action stamps
           A date or action stamp in angle brackets resolves to a selection set of all matching
           commits.

           ┌───────────────────────────────┬──────────────────────────────────┐
           │                               │                                  │
           │type                           │ interpretation                   │
           ├───────────────────────────────┼──────────────────────────────────┤
           │                               │                                  │
           │RFC3339 timestamp              │ commit or tag with that          │
           │                               │ time/date                        │
           ├───────────────────────────────┼──────────────────────────────────┤
           │                               │                                  │
           │action stamp (timestamp!email) │ commits or tags with that        │
           │                               │ timestamp and author (or         │
           │                               │ committer if no author). Aliases │
           │                               │ of the author are also accepted. │
           ├───────────────────────────────┼──────────────────────────────────┤
           │                               │                                  │
           │yyyy-mm-dd part of RFC3339     │ all commits and tags with that   │
           │timestamp                      │ date                             │
           └───────────────────────────────┴──────────────────────────────────┘

           To refine the match to a single commit, use a 1-origin index suffix separated by #.
           Thus <2000-02-06T09:35:10Z> can match multiple commits, but <2000-02-06T09:35:10Z#2>
           matches only the second in the set.
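
           A minimal Python sketch of this lookup, assuming commits are held as dictionaries
           with hypothetical "mark", "date" and "email" keys. It ignores author aliases and
           tags, and matches a date by simple string prefix, which is looser than a real
           implementation would be.

```python
COMMITS = [
    {"mark": ":2", "date": "2000-02-06T09:35:10Z", "email": "fred@example.com"},
    {"mark": ":4", "date": "2000-02-06T09:35:10Z", "email": "jane@example.com"},
    {"mark": ":6", "date": "2000-02-07T11:00:00Z", "email": "fred@example.com"},
]


def parse_stamp(ref):
    """Split '<timestamp[!email][#n]>' into (timestamp, email, index)."""
    body = ref.strip("<>")
    body, _, index = body.partition("#")
    timestamp, _, email = body.partition("!")
    return timestamp, (email or None), (int(index) if index else None)


def resolve(ref, commits=COMMITS):
    """Return the marks of commits matching a date or action-stamp reference."""
    timestamp, email, index = parse_stamp(ref)
    matches = [c["mark"] for c in commits
               if c["date"].startswith(timestamp)
               and (email is None or c["email"] == email)]
    # The '#n' suffix is a 1-origin index into the matching set.
    return matches if index is None else matches[index - 1:index]


print(resolve("<2000-02-06T09:35:10Z>"))
print(resolve("<2000-02-06T09:35:10Z#2>"))
```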

       text search
           A text search expression is a regular expression surrounded by forward slashes (to
           embed a forward slash in it, use a C-like string escape such as \x2f).

           A text search normally matches against the comment fields of commits and annotated
           tags, or against their author/committer names, or against the names of tags; also the
           text of passthrough objects.

           The scope of a text search can be changed with qualifier letters after the trailing
           slash. These are as follows:

           ┌───────┬─────────────────────────────────┐
           │       │                                 │
           │letter │ interpretation                  │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │a      │ author name in commit           │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │b      │ branch name in commit; also     │
           │       │ matches blobs referenced by     │
           │       │ commits on matching branches,   │
           │       │ and tags which point to commits │
           │       │ on matching branches.           │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │c      │ comment text of commit or tag   │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │r      │ committish reference in tag or  │
           │       │ reset                           │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │p      │ text in passthrough             │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │t      │ tagger in tag                   │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │n      │ name of tag                     │
           ├───────┼─────────────────────────────────┤
           │       │                                 │
           │B      │ blob content                    │
           └───────┴─────────────────────────────────┘

           Multiple qualifier letters can add more search scopes.

           (The "b" qualifier replaces the branchset syntax in earlier versions of reposurgeon.)

       paths
           A "path expression" enclosed in square brackets resolves to the set of all commits and
           blobs related to a path matching the given expression. The path expression itself is
           either a path literal or a regular expression surrounded by slashes. Immediately after
           the trailing / of a path regexp you can put any number of the following characters
           which act as flags: ‘a’, ‘c’, ‘D’, ‘M’, ‘R’, ‘C’, ‘N’.

           By default, a path is related to a commit if the latter has a fileop that touches that
           file path - modifies that change it, deletes that remove it, renames and copies that
           have it as a source or target. When the ‘c’ flag is in use the meaning changes: the
           paths related to a commit become all paths that would be present in a checkout for
           that commit.

           A path literal matches a commit if and only if the path literal is exactly one of the
           paths related to the commit (no prefix or suffix operation is done). In particular a
           path literal won’t match if it corresponds to a directory in the chosen repository.

           A regular expression matches a commit if it matches any path related to the commit
           anywhere in the path. You can use ^ or $ if you want the expression to only match at
           the beginning or end of paths. When the ‘a’ flag is in use, the path expression
           selects commits whose every path matches the regular expression. This is not
           necessarily a subset of commits selected without the ‘a’ flag because it also selects
           commits with no related paths (e.g. empty commits, deletealls and commits with empty
           trees). If you want to avoid those, you can use e.g. ‘[/regex/] & [/regex/a]’.

           The flags ‘D’, ‘M’, ‘R’, ‘C’, ‘N’ restrict match checking to the corresponding fileop
           types. Note that this means an ‘a’ match is easier (not harder) to achieve. These are
           no-ops when used with ‘c’.

           A path literal or regular expression matches a blob if it matches any path that
           appeared in a modification fileop that referred to that blob. To select purely
           matching blobs or matching commits, compose a path expression with =B or =C.

           If you need a literal slash inside your regular expression (e.g. ‘[^/]’ to express
           "all characters but a slash"), write it with a C-like string escape such as \x2f.
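
           The difference between default regular-expression matching and the ‘a’ flag can be
           sketched in Python. This is an illustration under assumptions: the TOUCHED table
           and function names are hypothetical, and Python's re module stands in for the Go
           regular-expression engine (the pattern used is valid in both).

```python
import re

# Hypothetical commits mapped to the paths their fileops touch.
TOUCHED = {
    "A": ["src/main.c", "src/util.c"],
    "B": ["src/main.c", "README"],
    "C": [],                       # an empty commit: no related paths
}


def path_match(pattern, paths, all_flag=False):
    """Unanchored match against any path, or against every path with 'a'."""
    hits = [re.search(pattern, p) is not None for p in paths]
    return all(hits) if all_flag else any(hits)


def select(pattern, all_flag=False):
    return [name for name, paths in TOUCHED.items()
            if path_match(pattern, paths, all_flag)]


default = select(r"^src/")               # like [/^src\x2f/]
every = select(r"^src/", all_flag=True)  # like [/^src\x2f/a]
both = [name for name in default if name in every]
print(default, every, both)
```

           Commit C is selected only with the ‘a’ flag, because "every path matches" holds
           trivially when there are no paths; intersecting the two selections removes it.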

       function calls
           The expression language has named special functions. The sequence for a named function
           is “@” followed by a function name, followed by an argument in parentheses. Presently
           the following functions are defined:

           ┌─────┬─────────────────────────────────┐
           │     │                                 │
           │name │ interpretation                  │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │min  │ minimum member of a selection   │
           │     │ set                             │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │max  │ maximum member of a selection   │
           │     │ set                             │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │amp  │ nonempty selection set becomes  │
           │     │ all objects, empty set is       │
           │     │ returned empty                  │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │par  │ all parents of commits in the   │
           │     │ argument set                    │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │chn  │ all children of commits in the  │
           │     │ argument set                    │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │dsc  │ all commits descended from the  │
           │     │ argument set (argument set      │
           │     │ included)                       │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │anc  │ all commits whom the argument   │
           │     │ set is descended from (argument │
           │     │ set included)                   │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │pre  │ events before the argument set; │
           │     │ empty if the argument set       │
           │     │ includes the first event.       │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │suc  │ events after the argument set;  │
           │     │ empty if the argument set       │
           │     │ includes the last event.        │
           ├─────┼─────────────────────────────────┤
           │     │                                 │
           │srt  │ sort the argument set by event  │
           │     │ number.                         │
           └─────┴─────────────────────────────────┘
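
           The ancestry functions can be pictured as graph traversals. The Python sketch
           below uses a hypothetical five-commit history and plain sets, so it shows
           membership only and ignores the ordering of real selection sets.

```python
# Hypothetical history: commit number -> list of parent commit numbers.
PARENTS = {1: [], 2: [1], 3: [2], 4: [2], 5: [3, 4]}


def par(selection):
    """All parents of commits in the argument set."""
    return {p for c in selection for p in PARENTS[c]}


def chn(selection):
    """All children of commits in the argument set."""
    return {c for c, parents in PARENTS.items() if set(parents) & set(selection)}


def closure(selection, step):
    """Apply step repeatedly until no new commits appear (argument set included)."""
    result = set(selection)
    frontier = set(selection)
    while frontier:
        frontier = step(frontier) - result
        result |= frontier
    return result


def anc(selection):
    return closure(selection, par)


def dsc(selection):
    return closure(selection, chn)


print(sorted(dsc({3})), sorted(anc({5})))
```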

       Set expressions may be combined with the operators ‘|’ and ‘&’ which are, respectively,
       set union and intersection. The | has lower precedence than intersection, but you may use
       parentheses ‘(’ and ‘)’ to group expressions in case there is ambiguity (this replaces the
       curly brackets used in older versions of the syntax).

       Any set operation may be followed by ‘?’ to add the set members' neighbors and referents.
       This extends the set to include the parents and children of all commits in the set, and
       the referents of any tags and resets in the set. Each blob reference in the set is
       replaced by all commit events that refer to it. The ? can be repeated to extend the
       neighborhood depth. The result of a ? extension is sorted so the result is in ascending
       order.

       Do set negation with prefix ‘~’; it has higher precedence than & and | but lower than ?.
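
       Python's set operators happen to follow the same relative precedence for union and
       intersection, so they give a convenient way to check how an expression groups. In this
       sketch the ten-event repository and the negate helper are hypothetical, and negation is
       modelled as the complement relative to all events.

```python
ALL_EVENTS = set(range(1, 11))   # hypothetical repository with events 1..10


def negate(selection):
    """Complement of a selection relative to all events (models prefix ~)."""
    return ALL_EVENTS - selection


a, b, c = {1, 2, 3}, {3, 4, 5}, {5, 6}

loose = a | b & c         # groups as a | (b & c)
grouped = (a | b) & c     # parentheses force the union first
negated = negate(a) & b   # like ~a & b: negation binds before intersection

print(sorted(loose), sorted(grouped), sorted(negated))
```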

   IMPORT AND EXPORT
       reposurgeon can hold multiple repository states in core. Each has a name. At any given
       time, one may be selected for editing. Commands in this group import repositories, export
       them, and manipulate the in-core list and the selection.

       read [ --format=fossil ] [ --no-implicit ] [ directory | - | <infile ]
           With a directory-name argument, this command attempts to read in the contents of a
           repository in any supported version-control system under that directory; read with no
           arguments does this in the current directory. If input is redirected from a plain file,
           it will be read in as a fast-import stream or Subversion dumpfile. With an argument of
           ‘-’, this command reads a fast-import stream or Subversion dumpfile from standard
           input (this will be useful in filters constructed with command-line arguments).

           If the content is a fast-import stream, any “cvs-revision” property on a commit is
           taken to be a newline-separated list of CVS revision cookies pointing to the commit,
           and used for reference lifting.

           If the content is a fast-import stream, any “legacy-id” property on a commit is taken
           to be a legacy ID token pointing to the commit, and used for reference-lifting.

           If the read location is a git repository and contains a .git/cvsauthors file (such as
           is left in place by ‘git cvsimport -A’) that file will be read in as if it had been
           given to the ‘authors read’ command.

           If the read location is a directory, and its repository subdirectory has a file named
           legacy-map, that file will be read as though passed to a ‘legacy read’ command.

           If the read location is a file and the --format=fossil option is used, the file is
           interpreted as a Fossil repository.

           The just-read-in repo is added to the list of loaded repositories and becomes the
           current one, selected for surgery. If it was read from a plain file and the file name
           ends with one of the extensions ‘.fi’ or ‘.svn’, that extension is removed from the
           load list name.

           Normally, missing ‘from’ links in input streams are defaulted to the previous commit.
           The --no-implicit option disables this and may enable round-tripping of some streams
           on which it would fail (note however that git fast-export generates explicit ‘from’
           links). This option will mainly be useful for testing and debugging.

           Note: this command does not take a selection set.
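
           The load-list naming rule for plain files described above amounts to trimming a
           known extension. A minimal Python sketch, with a hypothetical function name and
           without any claim about how directory components are treated:

```python
def load_list_name(filename):
    """Drop a trailing '.fi' or '.svn' extension; leave other names alone."""
    for extension in (".fi", ".svn"):
        if filename.endswith(extension):
            return filename[:-len(extension)]
    return filename


print(load_list_name("project.fi"))
```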

       write [ --legacy ] [ --format=fossil ] [ --noincremental ] [ --callout ] [ >outfile | - ]
           Dump selected events as a fast-import stream representing the edited repository; the
           default selection set is all events. Where to dump to is standard output if there is
           no argument or the argument is ‘-’, or the target of an output redirect.

           Alternatively, if there is no redirect and the argument names a directory, the
           repository is rebuilt into that directory, with any selection set being ignored; if
           that target directory is nonempty its contents are backed up to a save directory.

           If the write location is a file and the --format=fossil option is used, the file is
           written in Fossil repository format.

           With the --legacy option, the Legacy-ID of each commit is appended to its commit
           comment at write time. This option is mainly useful for debugging conversion edge
           cases.

           If you specify a partial selection set such that some commits are included but their
           parents are not, the output will include incremental dump cookies for each branch with
           an origin outside the selection set, just before the first reference to that branch in
           a commit. An incremental dump cookie looks like “refs/heads/foo^0” and is a clue to
           export-stream loaders that the branch should be glued to the tip of a pre-existing
           branch of the same name. The --noincremental option suppresses this behavior.
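
           The placement of incremental dump cookies can be pictured with the following Python
           sketch. It is not reposurgeon’s code: ExportCommit and plan_partial_dump are
           hypothetical names, and the sketch assumes that a branch has "an origin outside the
           selection set" when the first selected commit on that branch has an unselected parent.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ExportCommit:
    mark: str                                          # e.g. ":3"
    branch: str                                        # e.g. "refs/heads/foo"
    parents: List[str] = field(default_factory=list)   # parent marks


def plan_partial_dump(commits: List[ExportCommit], selected: Set[str],
                      incremental: bool = True) -> List[str]:
    """List, in order, the cookies and commit marks a partial dump would carry."""
    out: List[str] = []
    seen_branches: Set[str] = set()
    for commit in commits:
        if commit.mark not in selected:
            continue
        if commit.branch not in seen_branches:
            seen_branches.add(commit.branch)
            # Assumption: an unselected parent means the branch originates
            # outside the selection, so a cookie precedes its first commit.
            if incremental and any(p not in selected for p in commit.parents):
                out.append(commit.branch + "^0")
        out.append(commit.mark)
    return out
```

           Passing incremental=False models the effect of --noincremental in this sketch.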

           When you specify a partial selection set, including a commit object forces the
           inclusion of every blob to which it refers and every tag that refers to it.

           Specifying a partial selection may cause a situation in which some parent marks in
           merges don’t correspond to commits present in the dump. When this happens and the
           --callout option was specified, the write code replaces the merge mark with a callout,
           the action stamp of the parent commit; otherwise the parent mark is omitted. Importers
           will fail when reading a stream dump with callouts; such a dump is intended to be used
           by the ‘graft’ command.

           Specifying a write selection set with gaps in it is allowed but unlikely to lead to
           good results if it is loaded by an importer.

           Property extensions will be omitted from the output if the importer for the
           preferred repository type cannot digest them.

           Note: to examine small groups of commits without the progress meter, use ‘inspect’.

       choose [ reponame ]
           Choose a named repo on which to operate. The name of a repo is normally the basename
           of the directory or file it was loaded from, but repos loaded from standard input are
           "unnamed". reposurgeon will add a disambiguating suffix if there have been multiple
           reads from the same source.

           With no argument, lists the names of the currently stored repositories and their load
           times. The second column is ‘*’ for the currently selected repository, ‘-’ for others.

       drop [ reponame ]
           Drop a repo named by the argument from reposurgeon’s list, freeing the memory used for
           its metadata and deleting on-disk blobs. With no argument, drops the currently chosen
           repo.

       rename reponame
           Rename the currently chosen repo; requires an argument. Won’t do it if there is
           already one by the new name.

   REBUILDS IN PLACE
       reposurgeon can rebuild an altered repository in place. Untracked files are normally saved
       and restored when the contents of the new repository is checked out (but see the
       documentation of the ‘preserve’ command for a caveat).

       rebuild [ directory ]
           Rebuild a repository from the state held by reposurgeon. This command does not take a
           selection set.

           The single argument, if present, specifies the target directory in which to do the
           rebuild; if the repository read was from a repo directory (and not a git-import
           stream), it defaults to that directory. If the target directory is nonempty its
           contents are backed up to a save directory. Files and directories on the repository’s
           preserve list are copied back from the backup directory after repo rebuild. The
           default preserve list depends on the repository type, and can be displayed with the
           ‘stats’ command.

           If reposurgeon has a nonempty legacy map, it will be written to a file named
           legacy-map in the repository subdirectory as though by a ‘legacy write’ command. (This
           will normally be the case for Subversion and CVS conversions.)

       preserve [ file... ]
           Add (presumably untracked) files or directories to the repo’s list of paths to be
           restored from the backup directory after a ‘rebuild’. Each argument, if any, is
           interpreted as a pathname. The current preserve list is displayed afterwards.

           It is only necessary to use this feature if your version-control system lacks a
           command to list files under version control. Under systems with such a command (which
           include git and hg), all files that are neither beneath the repository dot directory
           nor under reposurgeon temporary directories are preserved automatically.

       unpreserve [ file... ]
           Remove (presumably untracked) files or directories from the repo’s list of paths to be
           restored from the backup directory after a ‘rebuild’. Each argument, if any, is
           interpreted as a pathname. The current preserve list is displayed afterwards.

   TIMEQUAKES AND TIMEBUMPS
       Modifying a repository so every commit in it has a unique timestamp is often a useful
       thing to do, so that every commit has a unique action stamp that can be referred to
       in surgical commands.

       timequake
           Attempt to hack committer and author time stamps in the selection set (defaulting to
           all commits in the repository) to be unique. Works by identifying collisions between
           parent and child, then incrementing child timestamps so they no longer coincide. Won’t
           touch commits with multiple parents.

           Because commits are checked in ascending order, this logic will normally do the right
           thing on chains of three or more commits with identical timestamps.

           Any timestamp collisions left after this operation are probably cross-branch and have
           to be individually dealt with using ‘timebump’ commands.
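
           A rough Python sketch of this pass follows. StampedCommit and the function are
           hypothetical, not reposurgeon’s implementation. The sketch assumes that a child
           collides if its timestamp equalled its parent’s either before or after the parent was
           bumped; that reading is one way to get the chain behavior described above and is an
           assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StampedCommit:
    mark: str
    timestamp: int                                     # seconds since the Unix epoch
    parents: List[str] = field(default_factory=list)   # parent marks


def timequake(commits: List[StampedCommit]) -> int:
    """Bump colliding child timestamps in place; return how many were changed.

    `commits` must be listed in ascending order, parents before children.
    """
    by_mark: Dict[str, StampedCommit] = {c.mark: c for c in commits}
    original: Dict[str, int] = {c.mark: c.timestamp for c in commits}
    modified = 0
    for child in commits:
        if len(child.parents) != 1:
            continue  # roots and commits with multiple parents are left alone
        parent = by_mark.get(child.parents[0])
        if parent is None:
            continue
        collided_before = original[child.mark] == original[parent.mark]
        collides_now = child.timestamp == parent.timestamp
        if collided_before or collides_now:
            # Place the child one second after the parent's current stamp.
            child.timestamp = parent.timestamp + 1
            modified += 1
    return modified
```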

       timebump [ seconds ]
           Bump the committer and author timestamps of commits in the selection set (defaulting
           to empty) by one second. With a following integer argument, bump by that many seconds
           instead. The argument may be negative.

           Those of you twitchy about "rewriting history" should bear in mind that the commit
           stamps in many older repositories were never very reliable to begin with.

           CVS in particular is notorious for shipping client-side timestamps with timezone and
           DST issues (as opposed to UTC) that don’t necessarily compare well with stamps from
           different clients of the same CVS server. Thus, inducing a timequake in a CVS repo
           seldom produces effects anywhere near as large as the measurement noise of the
           repository’s own timestamps.

           Subversion was somewhat better about this, as commits were stamped at the server, but
           older Subversion repositories often have sections that predate the era of ubiquitous
           NTP time.

   INFORMATION AND REPORTS
       Commands in this group report information about the selected repository.

       The output of these commands can individually be redirected to a named output file. Where
       indicated in the syntax, you can prefix the output filename with ‘>’ and give it as a
       following argument. If you use ‘>>’ the file is opened for append rather than write.

       list [ >outfile ]
           This is the main command for identifying the events you want to modify. It lists
           commits in the selection set by event sequence number with summary information. The
           first column is raw event numbers, the second a timestamp in local time. If the
           repository has legacy IDs, they will be displayed in the third column. The leading
           portion of the comment follows.

       stamp [ >outfile ]
           Alternative form of listing that displays full action stamps, usable as references in
           selections.

       tip [ >outfile ]
           Display the branch tip names associated with commits in the selection set. These will
           not necessarily be the same as their branch fields (which will often be tag names if
           the repo contains either annotated or lightweight tags).

           If a commit is at a branch tip, its tip is its branch name. If it has only one child,
           its tip is the child’s tip. If it has multiple children, then if there is a child with
           a matching branch name its tip is the child’s tip. Otherwise this function throws a
           recoverable error.
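
           The tip rule above can be sketched in Python as follows. GraphNode, TipError, and
           tip_of are hypothetical names; the sketch assumes that "at a branch tip" means the
           commit has no children, and TipError merely stands in for the recoverable error.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class TipError(Exception):
    """Stands in for the recoverable error mentioned in the text."""


@dataclass
class GraphNode:
    mark: str
    branch: str
    children: List[str] = field(default_factory=list)  # child marks


def tip_of(mark: str, graph: Dict[str, GraphNode]) -> str:
    """Follow children until a childless commit is reached; return its branch."""
    node = graph[mark]
    while node.children:
        if len(node.children) == 1:
            node = graph[node.children[0]]
            continue
        # Multiple children: follow a child whose branch matches this commit's.
        matching = [graph[m] for m in node.children if graph[m].branch == node.branch]
        if not matching:
            raise TipError(f"cannot deduce a branch tip for {node.mark}")
        node = matching[0]
    return node.branch
```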

       tags [>outfile ]
           Display tags and resets: three fields, an event number and a type and a name. Branch
           tip commits associated with tags are also displayed with the type field ‘commit’.

       stats [ repo-name...] [>outfile ]
           Report size statistics and import/export method information about named repositories,
           or with no argument the currently chosen repository.

       count [>outfile ]
           Report a count of items in the selection set. Default set is everything in the
           currently-selected repo.

       inspect [>outfile ]
           Dump a fast-import stream representing selected events to standard output. Just like a
           write, except (1) the progress meter is disabled, and (2) there is an identifying
           header before each event dump.

       graph [>outfile ]
           Emit a visualization of the commit graph in the DOT markup language used by the
           graphviz tool suite. This can be fed as input to the main graphviz rendering program
           dot(1), which will yield a viewable image.

           You may find a script like this useful:

               graph $1 >/tmp/foo$$
               shell dot </tmp/foo$$ -Tpng | display -; rm /tmp/foo$$

           You can substitute in your own preferred image viewer, of course.

       sizes [>outfile ]
           Print a report on data volume per branch; takes a selection set, defaulting to all
           events. The numbers tally the size of uncompressed blobs, commit and tag comments, and
           other metadata strings (a blob is counted each time a commit points at it).

           The numbers are not an exact measure of storage size: they are intended mainly as a
           way to get information on how to efficiently partition a repository that has become
           large enough to be unwieldy.

       lint [ options ] [>outfile ]
           Look for DAG and metadata configurations that may indicate a problem. Presently checks
           for: (1) mid-branch deletes, (2) disconnected commits, (3) parentless commits, (4) the
           existence of multiple roots, (5) committer and author IDs that don’t look well-formed
           as DVCS IDs, (6) multiple child links with identical branch labels descending from the
           same commit, (7) time and action-stamp collisions.

           Options to issue only partial reports are supported; ‘lint --options’ or ‘lint -?’
           lists them.

           The options and output format of this command are unstable; they may change without
           notice as more sanity checks are added.

       when timespec
           Interconvert between git timestamps (integer Unix time plus TZ) and RFC3339 format.
           Takes one argument, autodetects the format. Useful when eyeballing export streams.
           Also accepts any other supported date format and converts to RFC3339.
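
           For readers who want to check a conversion by hand, here is a small Python sketch of
           the same interconversion. It is not the code behind ‘when’: the function is
           hypothetical, it handles only the two formats named above, and it assumes RFC3339
           output is normalized to UTC and that a zoneless input is UTC.

```python
import re
from datetime import datetime, timezone

# git-style stamp: integer Unix time, a space, then a +hhmm or -hhmm offset.
GIT_STAMP = re.compile(r"^(\d+) [+-]\d{4}$")


def when(timespec: str) -> str:
    """Convert a git timestamp to RFC3339, or an RFC3339 stamp to git form."""
    text = timespec.strip()
    match = GIT_STAMP.match(text)
    if match:
        # The integer part is already UTC seconds; the offset is display-only.
        stamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
        return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Otherwise treat the input as RFC3339; spell a trailing Z as +00:00
    # so that older versions of datetime.fromisoformat accept it.
    if text[-1:] in ("Z", "z"):
        text = text[:-1] + "+00:00"
    stamp = datetime.fromisoformat(text)
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: no zone means UTC
    return f"{int(stamp.timestamp())} {stamp.strftime('%z')}"
```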

   SURGICAL OPERATIONS
       These are the operations the rest of reposurgeon is designed to support.

       squash [ policy... ]
           Combine or delete commits in a selection set of events. The default selection set for
           this command is empty. Has no effect on events other than commits unless the --delete
           policy is selected; see the ‘delete’ command for discussion.

           Normally, when a commit is squashed, its file operation list (and any associated blob
           references) gets either prepended to the beginning of the operation list of each of
           the commit’s children or appended to the operation list of each of the commit’s
           parents. Then children of a deleted commit get it removed from their parent set and
           its parents added to their parent set.
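
           The default (forward) case can be sketched in Python. SquashCommit and squash_forward
           are hypothetical names, fileops are shown as plain strings, and comment merging and
           fileop normalization are left out. Where the inherited parents land in a child’s
           parent list is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SquashCommit:
    mark: str
    fileops: List[str]
    parents: List[str]      # parent marks


def squash_forward(commits: Dict[str, SquashCommit], victim: str) -> None:
    """Remove `victim`, pushing its fileops and parents onto each child."""
    gone = commits.pop(victim)
    for commit in commits.values():
        if victim not in commit.parents:
            continue
        # Prepend the squashed commit's operations to the child's list.
        commit.fileops = list(gone.fileops) + commit.fileops
        # Replace the victim in the child's parent set with the victim's own
        # parents, skipping any the child already has.
        idx = commit.parents.index(victim)
        inherited = [p for p in gone.parents if p not in commit.parents]
        commit.parents[idx:idx + 1] = inherited
```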

           The analogous operation is performed on commit comments, so no comment text is ever
           outright discarded. Exception: comments consisting of “*** empty log message ***”, as
           generated by CVS, are ignored.

           The default is to squash forward, modifying children; but see the list of policy
           modifiers below for how to change this.

               Warning
               It is easy to get the bounds of a squash command wrong, with confusing and
               destructive results. Beware thinking you can squash on a selection set to merge
               all commits except the last one into the last one; what you will actually do is to
               merge all of them to the first commit after the selected set.

           Normally, any tag pointing to a combined commit will also be pushed forward. But see
           the list of policy modifiers below for how to change this.

           Following all operation moves, every one of the altered file operation lists is
           reduced to a shortest normalized form. The normalized form detects various
           combinations of modification, deletion, and renaming and simplifies the operation
           sequence as much as it can without losing any information.

           The following modifiers change these policies:

           ┌──────────────┬──────────────────────────────────┐
           │              │                                  │
           │--delete      │ Simply discards all file ops and │
           │              │ tags associated with deleted     │
           │              │ commit(s).                       │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--no-coalesce │ Do not normalize the modified    │
           │              │ commit operations.               │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--pushback    │ Append fileops to parents,       │
           │              │ rather than prepending to        │
           │              │ children.                        │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--pushforward │ Prepend fileops to children.     │
           │              │ This is the default; it can be   │
           │              │ specified in a lift script for   │
           │              │ explicitness about intentions.   │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--tagforward  │ Any tag on the deleted commit is │
           │              │ pushed forward to the first      │
           │              │ child rather than being deleted. │
           │              │ This is the default; it can be   │
           │              │ specified for explicitness.      │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--tagback     │ Any tag on the deleted commit is │
           │              │ pushed backward to the first     │
           │              │ parent rather than being         │
           │              │ deleted.                         │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--quiet       │ Suppresses warning messages      │
           │              │ about deletion of commits with   │
           │              │ non-delete fileops.              │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--complain    │ The opposite of --quiet. Can be  │
           │              │ specified for explicitness.      │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--empty-only  │ Complain if a squash operation   │
           │              │ modifies a nonempty comment.     │
           ├──────────────┼──────────────────────────────────┤
           │              │                                  │
           │--blobs       │ Allow deletion of selected       │
           │              │ blobs.                           │
           └──────────────┴──────────────────────────────────┘

           Under any of these policies except --delete, deleting a commit that has children does
           not back out the changes made by that commit, as they will still be present in the
           blobs attached to versions past the end of the deletion set. All a delete does when
           the commit has children is lose the metadata information about when and by whom those
           changes were actually made; after the delete any such changes will be attributed to
           the first undeleted children of the deleted commits. It is expected that this command
           will be useful mainly for removing commits mechanically generated by repository
           converters such as cvs2svn.

       delete [ policy... ]
           Delete a selection set of events. The default selection set for this command is empty.
           On a set of commits, this is equivalent to a squash with the --delete flag. It
           unconditionally deletes tags, resets, and passthroughs; blobs can be removed only as a
           side effect of deleting every commit that points at them.

       divide
           Attempt to partition a repo by cutting the parent-child link between two specified
           commits (they must be adjacent). Does not take a general selection set. It is only
           necessary to specify the parent commit, unless it has multiple children in which case
           the child commit must follow (separate it with a comma).

           If the repo was named ‘foo’, you will normally end up with two repos named ‘foo-early’
           and ‘foo-late’ (option and feature events at the beginning of the early segment will
           be duplicated onto the beginning of the late one). But if the commit graph would
           remain connected through another path after the cut, the behavior changes. In this
           case, if the parent and child were on the same branch ‘qux’, the branch segments are
           renamed ‘qux-early’ and ‘qux-late’ but the repo is not divided.

       expunge [ --notagify ] [ path | /regexp/ ]...
           Expunge files from the selected portion of the repo history; the default is the entire
           history. The arguments to this command may be paths or regular expressions matching
           paths (regexps must be marked by being surrounded with //). String quotes and
           backslash escapes are interpreted when parsing the command line.

           All filemodify (M) operations and delete (D) operations involving a matched file in
           the selected set of events are disconnected from the repo and put in a removal set.
           Renames are followed as the tool walks forward in the selection set; each triggers a
           warning message. If a selected file is a copy (C) target, the copy will be deleted and
           a warning message issued. If a selected file is a copy source, the copy target will be
           added to the list of paths to be deleted and a warning issued.

           After file expunges have been performed, any commits with no remaining file operations
           will be removed, and any tags pointing to them. By default each deleted commit is
           replaced with a tag of the form ‘emptycommit-ident’ on the preceding commit unless
           --notagify is specified as an argument. Commits with deleted fileops pointing both in
           and outside the path set are not deleted, but are cloned into the removal set.

           The removal set is not discarded. It is assembled into a new repository named after
           the old one with the suffix ‘-expunges’ added. Thus, this command can be used to carve
           a repository into sections by file path matches.

       tagify [ --canonicalize ] [ --tipdeletes ] [ --tagify-merges ]
           Search for empty commits and turn them into tags. Takes an optional selection set
           argument defaulting to all commits. For each commit in the selection set, turn it into
           a tag with the same message and author information if it has no fileops. By default
           merge commits are not considered, even if they have no fileops (thus no tree
           differences with their first parent). To change that, use the --tagify-merges option.

           The name of the generated tag will be ‘emptycommit-ident’, where ident is generated
           from the legacy ID of the deleted commit, or from its mark, or from its index in the
           repository, with a disambiguation suffix if needed.

           With the --canonicalize option, tagify tries harder to detect trivial commits by first
           ensuring that all fileops of selected commits will have an actual effect when
           processed by fast-import.

           With the --tipdeletes option, tagify also considers branch tips with only deleteall
           fileops to be candidates for tagification. The corresponding tags get names of the
           form ‘tipdelete-branchname’ rather than the default ‘emptycommit-ident’.

           With the --tagify-merges option, tagify also tagifies merge commits that have no
           fileops. When this is done the merge link is moved to the tagified commit’s parent.

       coalesce [ --debug | --changelog ] [ timefuzz ]
           Scan the selection set for runs of commits with identical comments close to each other
           in time (this is a common form of scar tissue in repository up-conversions from older
           file-oriented version-control systems). Merge these cliques by deleting all but the
           last commit, in order; fileops from the deleted commits are pushed forward to that
           last one.

           The optional timefuzz argument, if present, is a maximum time separation in seconds; the
           default is 90 seconds.
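
           A simplified Python sketch of the clique merge is given below. LoggedCommit and the
           function are hypothetical. The sketch compares only comment text and the time gap
           between adjacent commits, and assumes the input is already in event order; any
           further matching criteria the real command applies are not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoggedCommit:
    mark: str
    timestamp: int          # seconds since the Unix epoch
    comment: str
    fileops: List[str]


def coalesce(commits: List[LoggedCommit], timefuzz: int = 90) -> List[LoggedCommit]:
    """Return a new list in which each run of matching commits is merged."""
    survivors: List[LoggedCommit] = []
    for commit in commits:
        current = LoggedCommit(commit.mark, commit.timestamp,
                               commit.comment, list(commit.fileops))
        if survivors:
            prev = survivors[-1]
            close = abs(current.timestamp - prev.timestamp) <= timefuzz
            if close and prev.comment == current.comment:
                # Delete the earlier commit; push its fileops forward.
                current.fileops = prev.fileops + current.fileops
                survivors.pop()
        survivors.append(current)
    return survivors
```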

           The default selection set for this command is =C, all commits. Occasionally you may
           want to restrict it, for example to avoid coalescing unrelated cliques of “*** empty
           log message ***” commits from CVS lifts.

           With the --debug option, show messages about mismatches.

           With the --changelog option, any commit with a comment containing the string ‘empty
           log message’ (such as is generated by CVS) and containing exactly one file operation
           modifying a path ending in ChangeLog is treated specially. Such ChangeLog commits are
           considered to match any commit before them by content, and will coalesce with it if
           the committer matches and the commit separation is small enough. This option handles a
           convention used by Free Software Foundation projects.

       split {at|by} item
           The first argument is required to be a commit location; the second is a preposition
           which indicates which splitting method to use. If the preposition is ‘at’, then the
           third argument must be an integer 1-origin index of a file operation within the
           commit. If it is ‘by’, then the third argument must be a pathname to be
           prefix-matched (pathname match is done first).

           The commit is copied and inserted into a new position in the event sequence,
           immediately following itself; the duplicate becomes the child of the original, and
           replaces it as parent of the original’s children. Commit metadata is duplicated; the
           new commit then gets a new mark. If the new commit has a legacy ID, the suffix
           ‘.split’ is appended to it.

           Finally, some file operations — starting at the one matched or indexed by the split
           argument — are moved forward from the original commit into the new one. Legal indices
           are 2-n, where n is the number of file operations in the original commit.
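
           The ‘at’ form can be sketched in Python as follows. SplitCommit, split_at, and the
           new_mark parameter are hypothetical, fileops are shown as plain strings, and
           re-linking the original’s children is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SplitCommit:
    mark: str
    legacy_id: Optional[str]
    fileops: List[str]


def split_at(commit: SplitCommit, index: int,
             new_mark: str) -> Tuple[SplitCommit, SplitCommit]:
    """Split `commit` at a 1-origin fileop index; return (original, new child)."""
    n = len(commit.fileops)
    if not 2 <= index <= n:
        raise ValueError(f"index must be in 2..{n}")
    # Operations before the index stay; the indexed one and later move forward.
    early = SplitCommit(commit.mark, commit.legacy_id, commit.fileops[:index - 1])
    late_id = commit.legacy_id + ".split" if commit.legacy_id else None
    late = SplitCommit(new_mark, late_id, commit.fileops[index - 1:])
    return early, late
```

           An index of 1 is rejected because it would leave the original commit with no file
           operations.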

       add { D path | M perm mark path | R source target | C source target}
           To a selected commit, add a specified fileop.

           For a D operation to be valid there must be an M operation for the path in the
           commit’s ancestry. For an M operation to be valid, the ‘perm’ part must be a token
           ending with 755 or 644 and the ‘mark’ must refer to a blob that precedes the commit
           location. For an R or C operation to be valid, there must be an M operation for the
           source in the commit’s ancestry.

       remove [ index | path | deletes ] [ to commit ]
           From a selected commit, remove a specified fileop. The op must be one of (a) the
           keyword ‘deletes’, (b) a file path, (c) a file path preceded by an op type set (some
           subset of the letters DMRCN), or (d) a 1-origin numeric index. The ‘deletes’ keyword
           selects all D fileops in the commit; the others select one each.

           If the ‘to’ clause is present, the removed op is appended to the commit specified by
           the following singleton selection set. This option cannot be combined with ‘deletes’.

           Note that this command does not attempt to scavenge blobs even if the deleted fileop
           might be the only reference to them. This behavior may change in a future release.

       blob
           Create a blob at mark :1 after renumbering other marks starting from :2. Data is taken
           from stdin, which may be a here-doc. This can be used with the add command to patch
           synthetic data into a repository.

       renumber
           Renumber the marks in a repository, from :1 up to :<n> where <n> is the count of the
           last mark. Just in case an importer ever cares about mark ordering or gaps in the
           sequence.

           A side effect of this command is to clean up stray "done" passthroughs that may have
           entered the repository via graft operations. After a renumber, the repository will
           have at most one "done" and it will be at the end of the events.

       dedup
           Deduplicate blobs in the selection set. If multiple blobs in the selection set have
           the same SHA1, throw away all but the first, and change fileops referencing them to
           instead reference the (kept) first blob.
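
           In outline, the deduplication looks like the Python sketch below. The data layout (a
           mapping from blob marks to content, and fileops as path and mark pairs) is a
           simplification chosen for the example, not reposurgeon’s internal representation.

```python
import hashlib
from typing import Dict, List, Tuple


def dedup(blobs: Dict[str, bytes],
          fileops: List[Tuple[str, str]]) -> Tuple[Dict[str, bytes], List[Tuple[str, str]]]:
    """Keep the first blob for each SHA1 and repoint fileops at it.

    `blobs` maps mark -> content in event order; `fileops` holds (path, mark).
    """
    first_by_digest: Dict[str, str] = {}
    remap: Dict[str, str] = {}
    kept: Dict[str, bytes] = {}
    for mark, data in blobs.items():
        digest = hashlib.sha1(data).hexdigest()
        if digest in first_by_digest:
            remap[mark] = first_by_digest[digest]   # duplicate: throw it away
        else:
            first_by_digest[digest] = mark
            kept[mark] = data
    new_ops = [(path, remap.get(mark, mark)) for path, mark in fileops]
    return kept, new_ops
```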

       msgout [ >outfile ] [ --filter=/regex/ ] [ --blobs ]
           Emit a file of messages in RFC2822 format representing the contents of repository
           metadata. Takes a selection set; members of the set other than commits, annotated
           tags, and passthroughs are ignored (that is, presently, blobs and resets), except that
           if the --blobs option is passed, blobs will also be included.

           May have an option --filter, followed by = and a /-enclosed regular expression. If
           this is given, only headers with names matching it are emitted. In this context the
           name of the header includes its trailing colon.

       msgin [ --create ] [ --empty-only ] [ <infile ] [ --changed >outfile ]
           Accept a file of messages in RFC2822 format representing the contents of the metadata
           in selected commits and annotated tags. Takes no selection set. If there is an
           argument it will be taken as the name of a message file to read from; if there is no
           argument, or the argument is ‘-’, the command reads from standard input.

           Users should be aware that modifying an Event-Number or Event-Mark field will change
           which event the update from that message is applied to. This is unlikely to have good
           results.

           The header CheckText, if present, is examined to see if the comment text of the
           associated event begins with it. If not, the item modification is aborted. This helps
           ensure that you are landing updates on the events you intend.

           If the --create modifier is present, new tags and commits will be appended to the
           repository. In this case it is an error for a tag name to match any existing tag name.
           Commit objects are created with no fileops. If Committer-Date or Tagger-Date fields
           are not present they are filled in with the time at which this command is executed. If
           Committer or Tagger fields are not present, reposurgeon will attempt to deduce the
           user’s git-style identity and fill it in. If a singleton commit set was specified for
           commit creations, the new commits are made children of that commit.

           Otherwise, if the Event-Number and Event-Mark fields are absent, the msgin logic will
           attempt to match the commit or tag first by Legacy-ID, then by a unique committer ID
           and timestamp pair.

           If output is redirected and the modifier --changed appears, a minimal set of
           modifications actually made is written to the output file in a form that can be fed
           back in.

           If the option --empty-only is given, this command will throw a recoverable error if it
           tries to alter a message body that is neither empty nor consists of the CVS
           empty-comment marker.

       setfield attribute value
           In the selected objects (defaulting to none) set every instance of a named field to a
           string value. The string may be quoted to include whitespace, and use C-style
           backslash escapes, such as \n and \t.

           Attempts to set nonexistent attributes are ignored. Valid values for the attribute are
           internal field names; in particular, for commits, ‘comment’ and ‘branch’ are legal.
           Consult the source code for other interesting values.

           The special fieldnames ‘author’, ‘commitdate’ and ‘authdate’ apply only to commits in
           the range. The latter two set attribution dates. The former sets the author’s name
           and email address (assuming the value can be parsed for both), copying the committer
           timestamp. The author’s timezone may be deduced from the email address.

       setperm {100644|100755|120000} path...
           For the selected objects (defaulting to none) take the first argument as an octal
           literal describing permissions. All subsequent arguments are paths. For each M fileop
           in the selection set and exactly matching one of the paths, patch the permission field
           to the first argument value.

       append [ --rstrip ] [ --legacy ] [text]
           Append text to the comments of commits and tags in the specified selection set. The
           text is the first token of the command and may be a quoted string. C-style escape
           sequences in the string are interpreted as one would expect.

           If the option --rstrip is given, the comment is right-stripped before the new text is
           appended. If the option --legacy is given, the string ‘%LEGACY%’ in the append payload
           is replaced with the commit’s legacy-ID before it is appended.
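
           As a rough model of the two options, here is a small Python sketch. The function
           append_sketch and its parameters are hypothetical; it assumes the legacy-ID is already
           known as a string.

```python
def append_sketch(comment, text, rstrip=False, legacy_id=None):
    """Model of 'append': optionally right-strip, expand %LEGACY%, then append."""
    if rstrip:
        comment = comment.rstrip()
    if legacy_id is not None:
        # Corresponds to --legacy: substitute the commit's legacy-ID.
        text = text.replace("%LEGACY%", legacy_id)
    return comment + text
```

           With rstrip set, trailing blank lines in the old comment are removed before the new
           text is attached, so the appended text should start with its own newline.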

       filter [ --shell | --regex | --replace | --dedos ]
           Run blobs, commit comments, or tag comments in the selection set through the filter
           specified on the command line.

           In any mode other than --dedos, attempting to specify a selection set including both
           blobs and non-blobs (that is, commits or tags) throws an error. Inline content in
           commits is filtered when the selection set contains (only) blobs and the commit is
           within the range bounded by the earliest and latest blob in the specification.

           With --shell, the remainder of the line specifies a filter as a shell command. Each
           blob or comment is presented to the filter on standard input; the content is replaced
           with whatever the filter emits to standard output.

           When filtering blobs, if the command line contains the magic cookie '%PATHS%' it is
           replaced with a space-separated list of all paths that reference the blob.

           With --regex, the remainder of the line is expected to be a regular expression
           substitution written as /from/to/, with from and to being passed as arguments to the
           standard re.sub() function, which is applied to modify the content. Any non-space
           character will work as a delimiter in place of the /; this makes it easier to use /
           in patterns. Ordinarily only the first such substitution is performed; putting ‘g’
           after the final delimiter replaces globally, and a numeric literal gives the maximum
           number of substitutions to perform. Other flags restrict the substitution scope:
           ‘c’ for comment text only, ‘C’ for committer name only, ‘a’ for author names only.
           Note that parsing of a --regex argument will be confused by any substring consisting
           of whitespace followed by #; use ‘\s’ rather than whitespace to avoid this.
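
           The delimiter and count rules can be sketched in Python using re.sub(). This is an
           illustration only: parse_substitution and apply_substitution are hypothetical names,
           the scope flags ‘c’, ‘C’ and ‘a’ are ignored, and escaped delimiters inside the
           pattern are not handled.

```python
import re

def parse_substitution(spec):
    """Split '<d>from<d>to<d>flags'; the first character is the delimiter."""
    delim = spec[0]
    parts = spec[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <delim>from<delim>to<delim>flags")
    pattern, replacement, flags = parts
    digits = "".join(ch for ch in flags if ch.isdigit())
    if "g" in flags:
        count = 0            # re.sub() treats count=0 as "replace all"
    elif digits:
        count = int(digits)  # numeric literal: maximum number of substitutions
    else:
        count = 1            # default: first match only
    return pattern, replacement, count

def apply_substitution(spec, content):
    pattern, replacement, count = parse_substitution(spec)
    return re.sub(pattern, replacement, content, count=count)
```

           For example, apply_substitution("|/usr|/opt|g", text) uses ‘|’ as the delimiter so
           that the slashes in the pattern need no escaping.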

           With --replace, the behavior is like --regex but the expressions are not interpreted
           as regular expressions. (This is slightly faster.)

           With --dedos, DOS/Windows-style \r\n line terminators are replaced with \n.

       transcode codec
           Transcode blobs, commit comments and committer/author names, or tag comments and tag
           committer names in the selection set to UTF-8 from the character encoding specified on
           the command line.

           Attempting to specify a selection set including both blobs and non-blobs (that is,
           commits or tags) throws an error. Inline content in commits is filtered when the
           selection set contains (only) blobs and the commit is within the range bounded by the
           earliest and latest blob in the specification.

           The encoding argument must name one of the codecs known to the Golang standard codecs
           library. In particular, ‘latin1’ is a valid codec name.

           Errors in this command are fatal, because an error may leave repository objects in a
           damaged state.

           The theory behind the design of this command is that the repository might contain a
           mixture of encodings used to enter commit metadata by different people at different
           times. After using =I to identify metadata containing non-Unicode high bytes in text,
           a human must use context to identify which particular encodings were used in
           particular event spans and compose appropriate transcode commands to fix them up.
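
           What a transcode does to a single piece of text can be shown with a short Python
           sketch. The helper names are hypothetical, and Python codec names stand in for the
           codec names reposurgeon itself accepts, which may differ.

```python
def is_valid_utf8(raw):
    """Return True if the byte string already decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def transcode_to_utf8(raw, codec):
    """Decode raw bytes with the named codec and re-encode them as UTF-8."""
    return raw.decode(codec).encode("utf-8")
```

           A committer name stored as the Latin-1 bytes b"Andr\xe9" fails the UTF-8 check and
           becomes b"Andr\xc3\xa9" after transcoding from ‘latin1’.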

       edit [ --blobs | --not-last ]
           Report the selection set of events to a tempfile as msgout does, call an editor on it,
           and update from the result as msgin does. If you do not specify an editor name as
           second argument, it will be taken from the $EDITOR variable in your environment. If
           $EDITOR is not set, /usr/bin/editor will be used as a fallback if it exists as a
           symlink to your default editor, as is the case on Debian, Ubuntu and their
           derivatives.

           Normally this command ignores blobs because msgout does. However, if you specify a
           selection set consisting of a single blob, your editor will be called directly on the
           blob file; alternatively, as with msgout, the --blobs option will include blobs in the
           file.

           In the singleton blob case (without --blobs), the command will warn if the blob to be
           edited appears in any commits whose descendants modify the same blob (since changes
           will not propagate to the descendant versions). This warning may be suppressed (e.g.
           in scripts) with the --not-last option.

           Supports < and > redirection.

       timeoffset offset [ timezone ]
           Apply a time offset to all time/date stamps in the selected set. An offset argument is
           required; it may be in the form [+-]ss, [+-]mm:ss or [+-]hh:mm:ss. The leading sign is
           required to distinguish it from a selection expression.

           Optionally you may also specify another argument in the form [+-]hhmm, a timezone
           literal to apply. To apply a timezone without an offset, use an offset literal of +0
           or -0.
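
           The three offset forms reduce to a signed number of seconds. The following Python
           sketch shows one way to parse them; parse_offset and shift are hypothetical helpers,
           and the optional timezone argument is not modeled.

```python
from datetime import datetime, timedelta, timezone

def parse_offset(text):
    """Convert [+-]ss, [+-]mm:ss or [+-]hh:mm:ss into signed seconds."""
    if not text or text[0] not in "+-":
        raise ValueError("offset needs a leading sign")
    fields = text[1:].split(":")
    if not 1 <= len(fields) <= 3 or not all(f.isdigit() for f in fields):
        raise ValueError("offset must be ss, mm:ss or hh:mm:ss")
    seconds = 0
    for field in fields:
        seconds = seconds * 60 + int(field)
    return -seconds if text[0] == "-" else seconds

def shift(stamp, offset):
    """Apply a parsed offset to a datetime."""
    return stamp + timedelta(seconds=parse_offset(offset))
```

           Under this reading, "-1:30" moves a timestamp back by ninety seconds and "+1:00:00"
           moves it forward by one hour.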

       unite [ --prune ] reponame...
           Unite repositories. Name any number of loaded repositories; they will be united into
           one union repo and removed from the load list. The union repo will be selected.

           The root of each repo (other than the oldest repo) will be grafted as a child to the
           last commit in the dump with a preceding commit date. This will produce a union
           repository with one branch for each part. Running last to first, duplicate tag and
           branch names will be disambiguated using the source repository name (thus, recent
           duplicates will get priority over older ones). After all grafts, marks will be
           renumbered.

           The name of the new repo will be the names of all parts concatenated, separated by
           ‘+’. It will have no source directory or preferred system type.

           With the option --prune, at each join D operations for every ancestral file existing
           will be prepended to the root commit, then it will be canonicalized using the rules
           for squashing; the effect will be that only files with properly matching M, R, and C
           operations in the root survive.

       graft [ --prune ] reponame
           For when unite doesn’t give you enough control. This command may have either of two
           forms, selected by the size of the selection set. The first argument is always
           required to be the name of a loaded repo.

           If the selection set is of size 1, it must identify a single commit in the currently
           chosen repo; in this case the named repo’s root will become a child of the specified
           commit. If the selection set is empty, the named repo must contain one or more
           callouts matching commits in the currently chosen repo.

           Labels and branches in the named repo are prefixed with its name; then it is grafted
           to the selected one. Any other callouts in the named repo are also resolved in the
           context of the currently chosen one. Finally, the named repo is removed from the load
           list.

           With the option --prune, prepend a deleteall operation into the root of the grafted
           repository.

       path source rename [ --force ] target
           Rename a path in every fileop of every selected commit. The default selection set is
           all commits. The first argument is interpreted as a regular expression to match
           against paths; the second may contain back-reference syntax.

           Ordinarily, if the target path already exists in the fileops, or is visible in the
           ancestry of the commit, this command throws an error. With the --force option, these
           checks are skipped.

       paths [ sub | sup ] [ dirname ] [ >outfile ]
           Takes a selection set. Without a modifier, list all paths touched by fileops in the
           selection set (which defaults to the entire repo). This reporting variant supports
           > redirection.

           With the ‘sub’ modifier, take a second argument that is a directory name and prepend
           it to every path. With the ‘sup’ modifier, strip any directory argument from the start
           of the path if it appears there; with no argument, strip the first directory component
           from every path.
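
           The effect of the two modifiers on a single path can be modeled as follows. This is
           a Python sketch with hypothetical function names; in particular, leaving a path with
           no directory component unchanged under ‘sup’ is an assumption.

```python
def sub_path(path, dirname):
    """'sub': prepend a directory name to the path."""
    return dirname.rstrip("/") + "/" + path

def sup_path(path, dirname=None):
    """'sup': strip dirname from the front, or the first component if none given."""
    if dirname is None:
        head, sep, rest = path.partition("/")
        return rest if sep else path
    prefix = dirname.rstrip("/") + "/"
    return path[len(prefix):] if path.startswith(prefix) else path
```

           So ‘sub’ with trunk turns src/main.c into trunk/src/main.c, and ‘sup’ with trunk
           reverses that without touching paths outside trunk.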

       merge
           Create a merge link. Takes a selection set argument, ignoring all but the lowest
           (source) and highest (target) members. Creates a merge link from the highest member
           (child) to the lowest (parent).

       unmerge
           Linearize a commit. Takes a selection set argument, which must resolve to a single
           commit, and removes all its parents except for the first. It is equivalent to
           'first_parent, commit reparent --rebase', where commit is the same selection set as
           used with unmerge and first_parent is a set resolving commit’s first parent (see the
           reparent command below). The main advantage of unmerge is that you don’t have to
           find and specify the first parent yourself, saving time and avoiding errors when
           nearby surgery would make a manual first parent argument stale.

       reparent [ options... ] [ policy ]
           Changes the parent list of a commit. Takes a selection set, zero or more option
           arguments, and an optional policy argument.

           Selection set: The selection set must resolve to one or more commits. The selected
           commit with the highest event number (not necessarily the last one selected) is the
           commit to modify. The remainder of the selected commits, if any, become its parents:
           the selected commit with the lowest event number (which is not necessarily the first
           one selected) becomes the first parent, the selected commit with second lowest event
           number becomes the second parent, and so on. All original parent links are removed.
           Examples:

               # this makes 17 the parent of 33
               17,33 reparent

               # this also makes 17 the parent of 33
               33,17 reparent

               # this makes 33 a root (parentless) commit
               33 reparent

               # this makes 33 an octopus merge commit. its first parent
               # is commit 15, second parent is 17, and third parent is 22
               22,33,15,17 reparent

           The option --use-order says to use the selection order to determine which selected
           commit is the commit to modify and which are the parents (and if there are multiple
           parents, their order). The last selected commit (not necessarily the one with the
           highest event number) is the commit to modify, the first selected commit (not
           necessarily the one with the lowest event number) becomes the first parent, the second
           selected commit becomes the second parent, and so on. Examples:

               # this makes 33 the parent of 17
               33,17 reparent --use-order

               # this makes 17 an octopus merge commit. its first parent
               # is commit 22, second parent is 33, and third parent is 15
               22,33,15,17 reparent --use-order
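
           The two ordering rules can be summarized in a short Python sketch. The function
           plan_reparent is hypothetical and is not part of reposurgeon; it only models how a
           selection is split into the commit to modify and its parent list, treating event
           numbers as plain integers.

```python
def plan_reparent(selection, use_order=False):
    """Return (commit_to_modify, parents) for event numbers as typed.

    Default: sort by event number, so the highest is modified and the
    rest become parents in ascending order. With use_order, the order
    in which the events were selected is kept instead.
    """
    ordered = list(selection) if use_order else sorted(selection)
    return ordered[-1], ordered[:-1]

print(plan_reparent([22, 33, 15, 17]))                  # (33, [15, 17, 22])
print(plan_reparent([22, 33, 15, 17], use_order=True))  # (17, [22, 33, 15])
```

           Both printed results agree with the octopus-merge examples given above.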

           Because ancestor commit events must appear before their descendants, giving a commit
           with a low event number a parent with a high event number triggers a re-sort of the
           events. A re-sort assigns different event numbers to some or all of the events.
           Re-sorting only works if the reparenting does not introduce any cycles. To swap the
           order of two commits that have an ancestor–descendant relationship without introducing
           a cycle during the process, you must reparent the descendant commit first.

           By default, the manifest of the reparented commit is computed before modifying it; a
           deleteall and some fileops are prepended so that the manifest stays unchanged even
           when the first parent has been changed. This behavior can be changed by specifying
           the policy flag --rebase. That inhibits the default behavior: no deleteall is issued
           and the tree contents of all descendants can be modified as a result.

       reorder [ --quiet ]
           Re-order a contiguous range of commits.

           Older revision control systems tracked change history on a per-file basis, rather than
           as a series of atomic changesets, which often made it difficult to determine the
           relationships between changes. Some tools which convert a history from one revision
           control system to another attempt to infer changesets by comparing file commit comment
           and time-stamp against those of other nearby commits, but such inference is a
           heuristic and can easily fail. In the best case, when inference fails, a range of
           commits in the resulting conversion which should have been coalesced into a single
           changeset instead end up as a contiguous range of separate commits. This situation
           typically can be repaired easily enough with the coalesce or squash commands.

           However, in the worst case, numerous commits from several different topics, each of
           which should have been one or more distinct changesets, may end up interleaved in an
           apparently chaotic fashion. To deal with such cases, the commits need to be
           re-ordered, so that those pertaining to each particular topic are clumped together,
           and then possibly squashed into one or more changesets pertaining to each topic. This
           command, reorder, can help with the first task; the squash command with the second.

           Selected commits are re-arranged in the order specified; for instance: ‘:7,:5,:9,:3
           reorder’. The specified commit range must be contiguous; each commit must be accounted
           for after re-ordering. Thus, for example, ‘:5’ can not be omitted from ‘:7,:5,:9,:3
           reorder’. (To drop a commit, use the delete or squash command.)
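
           The contiguity requirement can be modeled with a short Python sketch. It assumes a
           purely linear history given as a list of commit marks; check_reorder is a
           hypothetical helper, not the actual implementation.

```python
def check_reorder(history, requested):
    """Return the new linear order, or raise if the request is invalid.

    history   -- commit marks in their current (ancestor-first) order
    requested -- the same marks in the order the user wants
    """
    if len(set(requested)) != len(requested):
        raise ValueError("a commit was named more than once")
    positions = sorted(history.index(mark) for mark in requested)
    lo, hi = positions[0], positions[-1]
    if hi - lo + 1 != len(requested):
        raise ValueError("selected commits are not a contiguous range")
    return history[:lo] + list(requested) + history[hi + 1:]
```

           With a history of :1, :3, :5, :7, :9, :11, the request :7,:5,:9,:3 is accepted, while
           :7,:9,:3 is rejected because :5 lies inside the range but is not accounted for.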

           The selected commits must represent a linear history; however, the lowest numbered
           commit being re-ordered may have multiple parents, and the highest numbered may have
           multiple children. Re-ordered commits and their immediate descendants are inspected
           for rudimentary fileops inconsistencies. The command warns if re-ordering results in
           a commit trying to delete, rename, or copy a file before it was ever created.
           Likewise, it warns if all of a commit’s fileops become no-ops after re-ordering.
           Other fileops inconsistencies may arise from re-ordering, both within the range of
           affected commits and beyond; for instance, moving a commit which renames a file ahead
           of a commit which references the original name. Such anomalies can be discovered via
           manual inspection and repaired with the add and remove (and possibly path) commands.
           Warnings can be suppressed with --quiet.

           In addition to adjusting their parent/child relationships, re-ordering commits also
           re-orders the underlying events since ancestors must appear before descendants, and
           blobs must appear before commits which reference them. This means that events within
           the specified range will have different event numbers after the operation.

       branch branchname { rename | delete } [ arg ]
           Rename or delete a branch (and any associated resets). First argument must be an
           existing branch name; second argument must be one of the verbs ‘rename’ or ‘delete’.

           For a ‘rename’, the third argument may be any token that is a syntactically valid
           branch name (but not the name of an existing branch). If it does not contain a ‘/’ the
           prefix ‘heads/’ is prepended. If it does not begin with ‘refs/’, then ‘refs/’ is
           prepended.
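
           The prefixing rule for the new name can be written out as a small Python sketch;
           qualify_ref is a hypothetical name used only for illustration.

```python
def qualify_ref(name):
    """Expand a short branch name to a full reference."""
    if "/" not in name:
        name = "heads/" + name   # bare name: assume a branch head
    if not name.startswith("refs/"):
        name = "refs/" + name
    return name

print(qualify_ref("stable"))      # refs/heads/stable
print(qualify_ref("tags/v1"))     # refs/tags/v1
```

           A name that is already fully qualified, such as refs/heads/stable, passes through
           unchanged.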

           For a ‘delete’, the name may optionally be a regular expression wrapped in //; if so,
           all objects of the specified type with names matching the regexp are deleted. This is
           useful for mass deletion of branches. Such deletions can be restricted by a selection
           set in the normal way. No third argument is required.

       tag tagname { create | move | rename | delete } [ arg ]
           Create, move, rename, or delete a tag.

           Creation is a special case. First argument is a name, which must not be an existing
           tag. Takes a singleton event second argument which must point to a commit. A tag
           object pointing to the commit is created and inserted just after the last tag in the
           repo (or just after the last commit if there are no tags). The tagger, committish, and
           comment fields are copied from the commit’s committer, mark, and comment fields.

           Otherwise, first argument must be an existing tag name; second argument must be one of
           the verbs ‘move’, ‘rename’, or ‘delete’.

           For a ‘move’, a third argument must be a singleton selection set. For a ‘rename’, the
           third argument may be any token that is a syntactically valid tag name (but not the
           name of an existing tag).

           For a ‘delete’, no third argument is required. The name portion of a delete may be a
           regexp wrapped in //; if so, all objects of the specified type with names matching the
           regexp are deleted. This is useful for mass deletion of junk tags such as CVS
           branch-root tags.

           The tagname may use C-style backslash escapes, such as \s.

           The behavior of this command is complex because features which present as tags may be
           any of three things: (1) True tag objects, (2) lightweight tags, actually sequences of
           commits with a common branchname beginning with ‘refs/tags’ - in this case the tag is
           considered to point to the last commit in the sequence, (3) Reset objects. These may
           occur in combination; in fact, stream exporters from systems with annotation tags
           commonly express each of these as a true tag object (1) pointing at the tip commit of
           a sequence (2) in which the basename of the common branch field is identical to the
           tag name. An exporter that generates lightweight-tagged commit sequences (2) may or
           may not generate resets pointing at their tip commits.

           This command tries to handle all combinations in a natural way by doing up to three
           operations on any true tag, commit sequence, and reset matching the source name. In a
           rename, all are renamed together. In a delete, any matching tag or reset is deleted;
           then matching branch fields are changed to match the branch of the unique descendant
           of the tagged commit, if there is one. When a tag is moved, no branch fields are
           changed and a warning is issued.

           Attempts to delete a lightweight tag may fail with the message “couldn’t determine a
           unique successor”. When this happens, the tag is on a commit with multiple children
           that have different branch labels. There is a hole in the specification of git
           fast-import streams that leaves it uncertain how branch labels can be safely
           reassigned in this case; rather than do something risky, reposurgeon throws a
           recoverable error.

       reset resetname { create | move | rename | delete } [ arg ]
           Create, move, rename, or delete a reset. Create is a special case; it requires a
           singleton selection which is the associated commit for the reset, takes as a first
           argument the name of the reset (which must not exist), and ends with the keyword
           create.

           In the other modes, the first argument must match an existing reset name; second
           argument must be one of the verbs ‘move’, ‘rename’, or ‘delete’.

           The reset name may use C-style backslash escapes, such as \s.

           For a ‘move’, a third argument must be a singleton selection set. For a ‘rename’, the
           third argument may be any token that matches a syntactically valid reset name (but not
           the name of an existing reset). For a ‘delete’, no third argument is required.

           For either name, if it does not contain a ‘/’ the prefix ‘heads/’ is prepended. If it
           does not begin with ‘refs/’, ‘refs/’ is prepended.

           An argument matches a reset’s name if it is either the entire reference
           (refs/heads/FOO or refs/tags/FOO for some value of FOO) or the basename (e.g. FOO), or
           a suffix of the form heads/FOO or tags/FOO. An unqualified basename is assumed to
           refer to a head.
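
           One reading of this matching rule, expressed as a Python sketch. The function
           matches_reset is hypothetical, and the treatment of a bare basename as matching only
           under refs/heads/ follows the last sentence above.

```python
def matches_reset(argument, refname):
    """Return True if argument names the reset whose full reference is refname."""
    if argument == refname:
        return True                                  # entire reference
    if "/" not in argument:
        return refname == "refs/heads/" + argument   # unqualified basename: a head
    return refname == "refs/" + argument             # heads/FOO or tags/FOO suffix
```

           Under this reading FOO matches refs/heads/FOO but not refs/tags/FOO; to name the tag
           reset, tags/FOO or the full reference is needed.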

           When a reset is renamed, commit branch fields matching the reset are renamed with it
           to match. When a reset is deleted, matching branch fields are changed to match the
           branch of the unique descendant of the tip commit of the associated branch, if there
           is one. When a reset is moved, no branch fields are changed.

       debranch source-branch [ target-branch ]
           Takes one or two arguments which must be the names of source and target branches; if
           the second (target) argument is omitted it defaults to refs/heads/master. Any trailing
           segment of a branch name is accepted as a synonym for it; thus master is the same as
           refs/heads/master. Does not take a selection set.

           The history of the source branch is merged into the history of the target branch,
           becoming the history of a subdirectory with the name of the source branch. Any resets
           of the source branch are removed.

       strip [ blobs | reduce ]
           Reduce the selected repository to make it a more tractable test case. Use this when
           reporting bugs.

           With the modifier ‘blobs’, replace each blob in the repository with a small,
           self-identifying stub, leaving all metadata and DAG topology intact. This is useful
           when you are reporting a bug, for reducing large repositories to test cases of
           manageable size.

           A selection set is effective only with the ‘blobs’ option, defaulting to all blobs.
           The ‘reduce’ mode always acts on the entire repository.

           With the modifier ‘reduce’, perform a topological reduction that throws out
           uninteresting commits. If a commit has all file modifications (no deletions or copies
           or renames) and has exactly one ancestor and one descendant, then it may be boring. To
           be fully boring, it must also not be referred to by any tag or reset. Interesting
           commits are not boring, or have a non-boring parent or non-boring child.

           With no modifiers, this command strips blobs.

       ignores [ rename ] [ translate ] [ defaults ]
           Intelligent handling of ignore-pattern files. This command fails if no repository has
           been selected or no preferred write type has been set for the repository. It does not
           take a selection set.

           If the rename modifier is present, this command attempts to rename all ignore-pattern
           files to whatever is appropriate for the preferred type - e.g. .gitignore for git,
           .hgignore for hg, etc. This option does not cause any translation of the ignore files
           it renames.

           If the translate modifier is present, syntax translation of each ignore file is
           attempted. At present, the only transformation the code knows is to prepend a ‘syntax:
           glob’ header if the preferred type is hg.

           If the defaults modifier is present, the command attempts to prepend the default
           ignore patterns to all ignore files. If no ignore file is created by the first commit,
           that commit will be modified to create one containing the defaults. This command will
           error out on preferred types that have no default ignore patterns (git and hg, in
           particular). It will also error out when it knows the import tool has already set
           default patterns.

       attribution [ selection ] { show | set | delete | prepend | append } [ args ]
           Inspect, modify, add, and remove commit and tag attributions.

           Attributions upon which to operate are selected in much the same way as events are
           selected, as described in SELECTION SYNTAX. selection is an expression composed of
           1-origin attribution-sequence numbers, ‘$’ for last attribution, ‘..’ ranges,
           comma-separated items, ‘(...)’ grouping, set operations ‘|’ union, ‘&’ intersection,
           and ‘~’ negation, and function calls @min(), @max(), @amp(), @pre(), @suc(), @srt().
           Attributions can also be selected by visibility set ‘=C’ for committers, ‘=A’ for
           authors, and ‘=T’ for taggers. Finally, ‘/regex/’ will attempt to match the regular
           expression regex against an attribution name and email address; ‘/n’ limits the match
           to only the name, and ‘/e’ to only the email address.

           With the exception of ‘show’, all actions require an explicit event selection upon
           which to operate. Available actions are:

           [ selection ] [ show ] [ >file ]
               Inspect the selected attributions of the specified events (commits and tags). The
               ‘show’ keyword is optional. If no attribution selection expression is given,
               defaults to all attributions. If no event selection is specified, defaults to all
               events. Supports > redirection.

           selection set [ name ] [ email ] [ date ]
               Assign name, email, date to the selected attributions. As a convenience, if only
               some fields need to be changed, the others can be omitted. Arguments name, email,
               and date can be given in any order.

           [ selection ] delete
               Delete the selected attributions. As a convenience, deletes all authors if
               selection is not given. It is an error to delete the mandatory committer and
               tagger attributions of commit and tag events, respectively.

           selection prepend [ name ] [ email ] [ date ]
               Insert a new attribution before the first attribution named by selection. The new
               attribution has the same type (committer, author, or tagger) as the one before
               which it is being inserted. Arguments name, email, and date can be given in any
               order.

               If name is omitted, an attempt is made to infer it from email by trying to match
               email against an existing attribution of the event, with preference given to the
               attribution before which the new attribution is being inserted. Similarly, email
               is inferred from an existing matching name. Likewise, for date.

               As a convenience, if selection is empty or not specified a new author is prepended
               to the author list.

               It is presently an error to insert a new committer or tagger attribution. To
               change a committer or tagger, use ‘set’ instead.

           selection append [ name ] [ email ] [ date ]
               Insert a new attribution after the last attribution named by selection. The new
               attribution has the same type (committer, author, or tagger) as the one after
               which it is being inserted. Arguments name, email, and date can be given in any
               order.

               If name is omitted, an attempt is made to infer it from email by trying to match
               email against an existing attribution of the event, with preference given to the
               attribution after which the new attribution is being inserted. Similarly, email is
               inferred from an existing matching name. Likewise, for date.

               As a convenience, if selection is empty or not specified a new author is appended
               to the author list.

               It is presently an error to insert a new committer or tagger attribution. To
               change a committer or tagger, use ‘set’ instead.

       gitify
           Attempt to massage comments into a git-friendly form with a blank separator line after
           a summary line. This code assumes it can insert a blank line if the first line of the
           comment ends with ‘.’, ‘,’, ‘:’, ‘;’, ‘?’, or ‘!’. If the separator line is already
           present, the comment won’t be touched.
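
           The heuristic can be sketched in Python as follows. The function gitify_sketch is a
           hypothetical model of the rule just described, not the code reposurgeon runs; it
           assumes comments are newline-separated strings.

```python
def gitify_sketch(comment):
    """Insert a blank line after the summary line when the heuristic allows it."""
    lines = comment.split("\n")
    if len(lines) < 2 or lines[1] == "":
        return comment          # single line, or separator already present
    first = lines[0]
    if first and first[-1] in ".,:;?!":
        return first + "\n\n" + "\n".join(lines[1:])
    return comment              # no terminating punctuation: leave untouched
```

           A comment whose first line lacks terminating punctuation is left alone, because the
           first line might simply continue onto the second.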

           Takes a selection set, defaulting to all commits and tags.

   REFERENCE LIFTING
       This group of commands is meant for fixing up references in commits that are in the format
       of older version control systems. The general workflow is this: first, go over the comment
       history and change all old-fashioned commit references into machine-parseable cookies.
       Then, automatically turn the machine-parseable cookie into action stamps. The point of
       dividing the process this way is that the first part is hard for a machine to get right,
       while the second part is prone to errors when a human does it.

       A Subversion cookie is a comment substring of the form ‘[[SVN:ddddd]]’ (example:
       ‘[[SVN:2355]]’) with the revision read directly via the Subversion exporter, deduced from
       git-svn metadata, or matching a $Revision$ header embedded in blob data for the filename.

       A CVS cookie is a comment substring of the form ‘[[CVS:filename:revision]]’ (example:
       ‘[[CVS:src/README:1.23]]’) with the revision matching a CVS $Id$ or $Revision$ header
       embedded in blob data for the filename.

       A mark cookie is of the form ‘[[:dddd]]’ and is simply a reference to the specified mark.
       You may want to hand-patch this in when one of the previous forms is inconvenient.
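
       The three cookie forms are regular enough to be recognized mechanically. The Python
       sketch below shows one possible pattern; the names COOKIE and find_cookies are
       hypothetical, and the character classes chosen for filenames and revisions are
       assumptions rather than the rules reposurgeon applies.

```python
import re

COOKIE = re.compile(
    r"\[\[(?:SVN:(?P<svn>\d+)"
    r"|CVS:(?P<cvsfile>[^:\]]+):(?P<cvsrev>[\d.]+)"
    r"|:(?P<mark>\d+))\]\]"
)

def find_cookies(comment):
    """Return a list of tuples describing each cookie found in comment."""
    found = []
    for m in COOKIE.finditer(comment):
        if m.group("svn"):
            found.append(("SVN", m.group("svn")))
        elif m.group("cvsfile"):
            found.append(("CVS", m.group("cvsfile"), m.group("cvsrev")))
        else:
            found.append(("mark", ":" + m.group("mark")))
    return found
```

       Applied to a comment containing ‘[[SVN:2355]]’ and ‘[[CVS:src/README:1.23]]’, the
       function reports one Subversion cookie and one CVS cookie in order of appearance.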

       An action stamp is an RFC3339 timestamp, followed by a ‘!’, followed by an author email
       address (author is preferred rather than committer because that timestamp is not changed
       when a patch is replayed on to a branch, but the code to make a stamp for a commit will
       fall back to the committer if no author field is present). It attempts to refer to a
       commit without being VCS-specific. Thus, instead of “commit 304a53c2” or “r2355”,
       “2011-10-25T15:11:09Z!fred@foonly.com”.
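
       Composing and splitting a stamp is straightforward, as this Python sketch shows. The
       function names are hypothetical, and normalizing the timestamp to UTC with a ‘Z’ suffix
       is an assumption based on the example above.

```python
from datetime import datetime, timedelta, timezone

def action_stamp(when, email):
    """Build '<RFC3339 timestamp>!<email>' from an aware datetime."""
    utc = when.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%SZ") + "!" + email

def split_stamp(stamp):
    """Split a stamp into its (timestamp, email) parts."""
    date, _, email = stamp.partition("!")
    return date, email

local = datetime(2011, 10, 25, 11, 11, 9, tzinfo=timezone(timedelta(hours=-4)))
print(action_stamp(local, "fred@foonly.com"))  # 2011-10-25T15:11:09Z!fred@foonly.com
```

       Because ‘!’ cannot appear in an RFC3339 timestamp, splitting at the first ‘!’ is
       unambiguous.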

       The following git aliases allow git to work directly with action stamps. Append them to
       your ~/.gitconfig; if you already have an [alias] section, leave off the first line.

           [alias]
                   # git stamp <commit-ish> - print a reposurgeon-style action stamp
                   stamp = show -s --format='%cI!%ce'

                   # git scommit <stamp> <rev-list-args> - list most recent commit that matches <stamp>.
                   # Must also specify a branch to search or --all, after these arguments.
                   scommit = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d -1\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"

                   # git scommits <stamp> <rev-list-args> - as above, but list all matching commits.
                   scommits = "!f(){ d=${1%%!*}; a=${1##*!}; arg=\"--until=$d --after $d\"; if [ $a != $1 ]; then arg=\"$arg --committer=$a\"; fi; shift; git rev-list $arg ${1:+\"$@\"}; }; f"

                   # git smaster <stamp> - list most recent commit on master that matches <stamp>.
                   smaster = "!f(){ git scommit \"$1\" master --first-parent; }; f"
                   smasters = "!f(){ git scommits \"$1\" master --first-parent; }; f"

                   # git shs <stamp> - show the commits on master that match <stamp>.
                   shs = "!f(){ stamp=$(git smasters $1); shift; git show ${stamp:?not found} $*; }; f"

                   # git slog <stamp> <log-args> - start git log at <stamp> on master
                   slog = "!f(){ stamp=$(git smaster $1); shift; git log ${stamp:?not found} $*; }; f"

                   # git sco <stamp> - check out most recent commit on master that matches <stamp>.
                   sco = "!f(){ stamp=$(git smaster $1); shift; git checkout ${stamp:?not found} $*; }; f"

       There is a rare case in which an action stamp will not refer uniquely to one commit. It is
       theoretically possible that the same author might check in revisions on different branches
       within the one-second resolution of the timestamps in a fast-import stream. There is
       nothing to be done about this; tools using action stamps need to be aware of the
       possibility and throw a warning when it occurs.
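
       One way a tool might detect this case is to group commits by stamp and report any
       stamp that maps to more than one commit. The sketch below assumes the tool has already
       collected (commit ID, stamp) pairs by its own means; ambiguous_stamps is a hypothetical
       helper.

```python
from collections import defaultdict


def ambiguous_stamps(commits):
    """Map each action stamp shared by several commits to their commit IDs.

    `commits` is an iterable of (commit_id, stamp) pairs gathered by the caller.
    """
    by_stamp = defaultdict(list)
    for commit_id, stamp in commits:
        by_stamp[stamp].append(commit_id)
    return {s: ids for s, ids in by_stamp.items() if len(ids) > 1}
```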

       In order to support reference lifting, reposurgeon internally builds a legacy-reference
       map that associates revision identifiers in older version-control systems with commits.
       The contents of this map come from three places: (1) cvs2svn:rev properties if the
       repository was read from a Subversion dump stream, (2) $Id$ and $Revision$ headers in
       repository files, and (3) the .git/cvs-revisions created by ‘git cvsimport’.

       The detailed sequence for lifting possible references is this: first, find possible CVS
       and Subversion references with the references or =N visibility set; then replace them with
       equivalent cookies; then run references lift to turn the cookies into action stamps (using
       the information in the legacy-reference map) without having to do the lookup by hand.

       references [ list | edit | lift ] [ >outfile ]
           With the modifier ‘list’, list commit and tag comments for strings that might be CVS-
           or Subversion-style revision identifiers. This will be useful when you want to replace
           them with equivalent cookies that can automatically be translated into VCS-independent
           action stamps. This reporting command supports >-redirection. It is equivalent to ‘=N
           list’.

           With the modifier ‘edit’, edit the set where revision IDs are found. This version of
           the command supports < and > redirection. This is equivalent to ‘=N edit’.

           With the modifier ‘lift’, attempt to resolve Subversion and CVS cookies in comments
           into action stamps using the legacy map. An action stamp is a
           timestamp/email/sequence-number combination uniquely identifying the commit associated
           with that blob, as described in TRANSLATION STYLE.

           It is not guaranteed that every such reference will be resolved, or even that any at
           all will be. Normally all references in history from a Subversion repository will
           resolve, but CVS references are less likely to be resolvable.
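
           The effect of a lift can be pictured with the following Python sketch. It assumes a
           legacy map held as a dictionary keyed by the full cookie text, which is an
           illustrative choice and not necessarily how reposurgeon stores it. Cookies with no
           entry are left untouched, matching the caveat above that not every reference will
           resolve.

```python
import re

# Matches Subversion and CVS cookies in the forms described earlier (assumed).
COOKIE = re.compile(r"\[\[(?:SVN:\d+|CVS:[^:\]]+:[0-9.]+)\]\]")


def lift(comment, legacy_map):
    """Replace each cookie found in legacy_map with its action stamp."""
    return COOKIE.sub(lambda m: legacy_map.get(m.group(0), m.group(0)), comment)
```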

   CHANGELOGS
       CVS, Subversion and Mercurial do not have separated notions of committer and author for
       changesets; when lifted to a VCS that does, like git, their one author field is used for
       both.

       However, if the project used the FSF ChangeLog convention, many changesets will include a
       ChangeLog modification listing an author for the commit. In the common case that the
       changeset was derived from a patch and committed by a project maintainer, but the
       ChangeLog entry names the actual author, this information can be recovered.

       Use the ‘changelogs’ command. This takes no required arguments, but may take a selection
       set; the default is all commits. It mines the selected ChangeLog files for authorship data.

       An optional following argument is a delimited regular expression to match the basename of
       files that should be treated as changelogs. The default expression is ‘/^ChangeLog$/’.

       It assumes such files are in the format used by FSF projects: entry header lines begin
       with YYYY-MM-DD and are followed by a fullname/address. When a ChangeLog file modification
       is found in a clique, the entry header at or before the section changed since its last
       revision is parsed and the address is inserted as the commit author.
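
       For illustration, an entry header in this format could be parsed roughly as follows. The
       regular expression is an approximation written from the description above and is not
       the one reposurgeon uses; parse_entry_header is a hypothetical name.

```python
import re

# Date, optional full name, then an address in angle brackets (assumed layout).
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.*?)\s*<([^<>\s]+@[^<>\s]+)>\s*$")


def parse_entry_header(line):
    """Return (date, name, email) for a ChangeLog entry header, else None."""
    m = HEADER.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)
```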

       If the entry header contains an email address but no name, a name will be filled in if
       possible by looking for the address in author map entries.

       In accordance with FSF policy for ChangeLogs, any date in an attribution header is
       discarded and the committer date is used. However, if the name is an author-map alias with
       an associated timezone, that zone is used.

       The Co-Author convention described in the Linux kernel’s co-author message conventions
       <https://git.wiki.kernel.org/index.php/CommitMessageConventions> is observed: If an
       attribution header is followed by a whitespace-led line containing only a valid email
       address, that name becomes the payload of a "Co-Author" header that is appended to the
       change comment for the containing commit.

       The command reports statistics on how many commits were altered.

   RELEASE TARBALLS
       When converting a legacy repository, it sometimes happens that there are archived releases
       of the project surviving from before the date of the repository’s initial commit. It may
       be desirable to insert those releases at the front of the repository history.

       To do this, use the ‘incorporate’ command. This inserts the contents of specified tarballs
       as commits. The tarball names are given as arguments; if no arguments, a list is read from
       stdin. Tarballs may be gzipped or bzipped. The initial segment of each path is assumed to
       be a version directory and stripped off. The number of segments stripped off can be set
       with the option --strip=<n>, n defaulting to 1.
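
       The effect of stripping leading path segments can be sketched in Python as below.
       strip_segments is a hypothetical helper that mirrors the described behavior of
       --strip=<n>; it is not reposurgeon code.

```python
def strip_segments(path, n=1):
    """Drop the first n slash-separated segments of a tarball member path."""
    return "/".join(path.split("/")[n:])
```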

       Takes a singleton selection set. Normally inserts before that commit; with the option
       --after, insert after it. The default selection set is the very first commit of the
       repository.

       The option --date can be used to set the commit date. It takes an argument, which is
       expected to be an RFC3339 timestamp.

       The generated commits have a committer field (the invoking user) and each gets as date the
       modification time of the newest file in the tarball (not the mod time of the tarball
       itself). No author field is generated. A comment recording the tarball name is generated.
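
       To show which date is meant, here is a Python sketch that finds the modification time of
       the newest regular file inside a tarball using the standard tarfile module. It only
       illustrates the rule stated above; newest_mtime is a hypothetical name, and it raises
       ValueError if the tarball contains no regular files.

```python
import io
import tarfile


def newest_mtime(tar_bytes):
    """Return the largest modification time among regular files in a tarball."""
    # Mode "r:*" lets tarfile detect gzip or bzip2 compression transparently.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        return max(m.mtime for m in tar.getmembers() if m.isfile())
```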

       Note that the import stream generated by this command is - while correct - not optimal,
       and may in particular contain duplicate blobs.

       With the --firewall option, generate an additional commit after the sequence consisting
       only of deletes crafted to prevent the incorporated content from leaking forward.

   VARIABLES AND MACROS
       Occasionally you will need to issue a large number of complex surgical commands of very
       similar form, and it’s convenient to be able to package that form so you don’t need to do
       a lot of error-prone typing. For those occasions, reposurgeon supports simple forms of
       named variables and macro expansion.

       assign [ name ]
           Compute a leading selection set and assign it to a symbolic name. It is an error to
           assign to a name that is already assigned, or to any existing branch name. Assignments
           may be cleared by sequence mutations (though not ordinary deletions); you will see a
           warning when this occurs.

           With no selection set and no name, list all assignments.

           If the option --singleton is given, the assignment will throw an error if the
           selection set is not a singleton.

           Use this to optimize out location and selection computations that would otherwise be
           performed repeatedly, e.g. in macro calls.

       unassign name
           Unassign a symbolic name. Throws an error if the name is not assigned.

       names [ >outfile ]
           List the names of all known branches and tags. Tells you what things are legal within
           angle brackets and parentheses.

       define name body
           Define a macro. The first whitespace-separated token is the name; the remainder of the
           line is the body, unless it is “{”, which begins a multi-line macro terminated by a
           line beginning with “}”.

           A later ‘do’ call can invoke this macro.

           The command ‘define’ by itself without a name or body produces a macro list.

       do name arguments...
           Expand and perform a macro. The first whitespace-separated token is the name of the
           macro to be called; remaining tokens replace {0}, {1}... in the macro definition.
           Tokens may contain whitespace if they are string-quoted; string quotes are stripped.
           Macros can call macros.

           If the macro expansion does not itself begin with a selection set, whatever set was
           specified before the ‘do’ keyword is available to the command generated by the
           expansion.
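
           The substitution step can be approximated in Python as follows. This sketch uses
           shlex to honor and strip string quotes, which is an assumption about tokenization
           details; expand_macro is a hypothetical name. Note that this naive version would
           re-expand a placeholder that appears inside an argument.

```python
import shlex


def expand_macro(body, argline):
    """Substitute {0}, {1}, ... in a macro body with tokens from argline."""
    args = shlex.split(argline)  # honors string quotes and strips them
    for i, arg in enumerate(args):
        body = body.replace("{%d}" % i, arg)
    return body
```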

       undefine name
           Undefine the named macro.

       Here’s an example to illustrate how you might use this. In CVS repositories of projects
       that use the GNU ChangeLog convention, a very common pre-conversion artifact is a commit
       with the comment “*** empty log message ***” that modifies only a ChangeLog entry
       explaining the commit immediately previous to it. The following

           define changelog <{0}> & /empty log message/ squash --pushback
           do changelog 2012-08-14T21:51:35Z
           do changelog 2012-08-08T22:52:14Z
           do changelog 2012-08-07T04:48:26Z
           do changelog 2012-08-08T07:19:09Z
           do changelog 2012-07-28T18:40:10Z

       is equivalent to the more verbose

           <2012-08-14T21:51:35Z> & /empty log message/ squash --pushback
           <2012-08-08T22:52:14Z> & /empty log message/ squash --pushback
           <2012-08-07T04:48:26Z> & /empty log message/ squash --pushback
           <2012-08-08T07:19:09Z> & /empty log message/ squash --pushback
           <2012-07-28T18:40:10Z> & /empty log message/ squash --pushback

       but you are less likely to make difficult-to-notice errors typing the first version.

       (Also note how the text regexp acts as a failsafe against the possibility of typing a
       wrong date that doesn’t refer to a commit with an empty comment. This was a real-world
       example from the CVS-to-git conversion of groff.)

       script filename [ arg... ]
           Takes a filename and optional following arguments. Reads each line from the file and
           executes it as a command.

           During execution of the script, the script name replaces the string $0 and the
           optional following arguments (if any) replace the strings $1, $2 ... $n in the script
           text. This is done before tokenization, so the $1 in a string like ‘foo$1bar’ will be
           expanded. Additionally, $$ is expanded to the current process ID (which may be useful
           for scripts that use tempfiles).
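
           A Python sketch of this textual substitution is shown below. The order of the
           replacements, and the choice to replace higher-numbered references first so that $1
           does not clobber $10, are assumptions made for the sketch; substitute is a
           hypothetical name.

```python
import os


def substitute(line, script_name, args):
    """Expand $$, $1..$n and $0 in one script line before tokenization."""
    line = line.replace("$$", str(os.getpid()))
    # Replace higher-numbered references first so $1 does not clobber $10.
    for i in range(len(args), 0, -1):
        line = line.replace("$%d" % i, args[i - 1])
    return line.replace("$0", script_name)
```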

           Within scripts (and only within scripts) reposurgeon accepts a slightly extended
           syntax: First, a backslash ending a line signals that the command continues on the
           next line. Any number of consecutive lines thus escaped are concatenated, without the
           ending backslashes, prior to evaluation. Second, a command that takes an input
           filename argument can instead take literal following data in the syntax of a shell
           here-document. That is: if the ‘<filename’ is replaced by ‘<<EOF’, all following lines
           in the script up to a terminating line consisting only of ‘EOF’ will be read, placed
           in a temporary file, and that file fed to the command and afterwards deleted. EOF may
           be replaced by any string. Backslashes have no special meaning while reading a
           here-document.

           Scripts may have comments. Any line beginning with a ‘#’ is ignored. If a line has a
           trailing portion that begins with one or more whitespace characters followed by ‘#’,
           that trailing portion is ignored.
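
           The continuation and comment rules might be sketched in Python like this. The order
           in which continuation joining and comment stripping are applied is an assumption,
           here-documents are not handled, and a command whose own text contains whitespace
           followed by ‘#’ would be truncated by this simplified version. logical_lines is a
           hypothetical name.

```python
import re


def logical_lines(text):
    """Yield script commands with continuations joined and comments removed."""
    pending = ""
    for raw in text.splitlines():
        if raw.endswith("\\"):
            pending += raw[:-1]  # drop the backslash, keep accumulating
            continue
        line = pending + raw
        pending = ""
        if line.startswith("#"):
            continue  # whole-line comment
        line = re.sub(r"\s+#.*$", "", line)  # trailing comment
        if line.strip():
            yield line
```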

           Scripts may call other scripts to arbitrary depth.

   ARTIFACT REMOVAL
       Some commands automate fixing various kinds of artifacts associated with repository
       conversions from older systems.

       authors [ read | write ] [ <filename ] [ >filename ]
           Apply or dump author-map information for the specified selection set, defaulting to
           all events.

           Lifts from CVS and Subversion may have only usernames local to the repository host in
           committer and author IDs. DVCSes want email addresses (net-wide identifiers) and
           complete names. To supply the map from one to the other, an authors file is expected
           to consist of lines each beginning with a local user ID, followed by a ‘=’ (possibly
           surrounded by whitespace) followed by a full name and email address, optionally
           followed by a timezone offset field. Thus:

               ferd = Ferd J. Foonly <foonly@foo.com> America/New_York

           An authors file may also contain lines of this form

               + Ferd J. Foonly <foonly@foobar.com> America/Los_Angeles

           These are interpreted as aliases for the last preceding ‘=’ entry that may appear in
           ChangeLog files. When such an alias is matched on a ChangeLog attribution line, the
           author attribution for the commit is mapped to the basename, but the timezone is used
           as is. This accommodates people with past addresses (possibly at different locations),
           unifying such aliases in metadata so searches and statistical aggregation will work
           better.

           An authors file may have comment lines beginning with ‘#’; these are ignored.
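
           To make the file format concrete, here is a Python sketch of a parser for it. The
           regular expressions and the returned data shapes are illustrative assumptions drawn
           from the description above; parse_authors is not reposurgeon's own reader.

```python
import re

ENTRY = re.compile(r"^(\S+)\s*=\s*(.+?)\s*<([^<>]+)>\s*(\S+)?\s*$")
ALIAS = re.compile(r"^\+\s*(.+?)\s*<([^<>]+)>\s*(\S+)?\s*$")


def parse_authors(text):
    """Return (entries, aliases) parsed from authors-file text.

    entries maps local ID -> (name, email, timezone or None); aliases is a
    list of (name, email, timezone or None, local ID of preceding '=' entry).
    """
    entries, aliases, last_id = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        m = ENTRY.match(line)
        if m:
            last_id = m.group(1)
            entries[last_id] = (m.group(2), m.group(3), m.group(4))
            continue
        m = ALIAS.match(line)
        if m and last_id is not None:
            aliases.append((m.group(1), m.group(2), m.group(3), last_id))
    return entries, aliases
```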

           When an authors file is applied, email addresses in committer and author metadata for
           which the local ID matches between < and @ are replaced according to the mapping (this
           handles git-svn lifts). Alternatively, if the local ID is the entire address, this is
           also considered a match (this handles what git-cvsimport and cvs2git do). If a
           timezone was specified in the map entry, that person’s author and committer dates are
           mapped to it.
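
           The two matching rules can be sketched in Python as a single lookup: take whatever
           precedes the first ‘@’ (or the whole address if there is none) as the local ID. The
           dictionary shape and the name map_address are assumptions for illustration, and
           timezone handling is omitted.

```python
def map_address(address, entries):
    """Return (name, email) for an address whose local ID is mapped, else None.

    `entries` maps local ID -> (name, email), a hypothetical in-memory form.
    """
    local_id = address.split("@", 1)[0]  # whole address when no '@' is present
    return entries.get(local_id)
```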

           With the ‘read’ modifier, or no modifier, apply author mapping data (from standard
           input or a <-redirected file). May be useful if you are editing a repo or dump created
           by cvs2git or by git-svn invoked without -A.

           With the ‘write’ modifier, write a mapping file that could be interpreted by ‘authors
           read’, with entries for each unique committer, author, and tagger (to standard output
           or a >-redirected mapping file). This may be helpful as a start on building an authors
           file, though each part to the right of an equals sign will need editing.

       branchify [ path-set ]
           Specify the list of directories to be treated as potential branches (to become tags if
           there are no modifications after the creation copies) when analyzing a Subversion
           repo. This list is ignored when the ‘--nobranch’ read option is used. It defaults to
           the 'standard layout' set of directories, plus any unrecognized directories in the
           repository root.

           String quotes and backslash escapes are interpreted when parsing the command line.

           With no arguments, displays the current branchification set.

           An asterisk at the end of a path in the set means ‘all immediate subdirectories of
           this path, unless they are part of another (longer) path in the branchify set’.

           Note that the branchify set is a property of the reposurgeon interpreter, not of any
           individual repository, and will persist across Subversion dumpfile reads. This may
           lead to unexpected results if you forget to re-set it.

       branchmap [ /regex/branch/... | reset ]
           Specify the list of regular expressions used for mapping the SVN branches that are
           detected by branchify. If none of the expressions match, the default behaviour applies.
           This maps a branch to the name of the last directory, except for trunk and * which are
           mapped to master and root.

           With no arguments the current regex replacement pairs are shown. Passing ‘reset’ will
           clear the mapping.

           String quotes and backslash escapes are not interpreted when parsing the command line;
           this would clash with the use of backslashes as substitution-part references. If you
           need to include a non-printing character in a regexp, use its C-style escape, e.g. \s
           for space.

           The branchify command will match each branch name against regex1 and if it matches
           rewrite its branch name to branch1. If not, it will try regex2 and so forth until it
           either finds a matching regex or there are no regexes left. The branch name can use Go
           backreferences.
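
           The first-match-wins rule can be pictured with this Python sketch. It is only an
           analogy: Python writes backreferences as \1 where Go regexp replacement syntax
           differs, and the example branch paths with trailing slashes are hypothetical.
           map_branch returns None when no rule matches, standing in for the default behaviour.

```python
import re


def map_branch(branch, rules):
    """Rewrite branch using the first matching (regex, replacement) rule."""
    for pattern, replacement in rules:
        if re.search(pattern, branch):
            return re.sub(pattern, replacement, branch, count=1)
    return None  # caller falls back to the default mapping
```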

           Note that the regular expressions are appended to ‘refs/’ without either the needed
           ‘heads/’ or ‘tags/’. This allows for choosing the right kind of branch type.

           While the syntax template above uses slashes, any first character will be used as a
           delimiter (and you will need to use a different one in the common case that the paths
           contain slashes).

           You must give this command before the Subversion repository read it is supposed to
           affect! This will not affect any other repository type.

           Note that the branchmap set is a property of the reposurgeon interpreter, not of any
           individual repository, and will persist across Subversion dumpfile or repository
           reads. This may lead to unexpected results if you forget to re-set it.

   EXAMINING TREE STATES
       manifest [ /regular expression/ ] [ >outfile ]
           Takes an optional selection set argument defaulting to all commits, and an optional
           regular expression. For each commit in the selection set, print the mapping of all
           paths in that commit tree to the corresponding blob marks, mirroring what files would
           be created in a checkout of the commit. If a delimited regular expression is given,
           only print "path -> mark" lines for paths matching it. This command supports >
           redirection.

       checkout directory
           Takes a selection set which must resolve to a single commit, and a second argument.
           The second argument is interpreted as a directory name. The state of the code tree at
           that commit is materialized beneath the directory.

       diff [ >outfile ]
           Display the difference between commits. Takes a selection-set argument which must
           resolve to exactly two commits. Supports output redirection.

   HOUSEKEEPING
       These are backed up by the following housekeeping commands, none of which take a selection
       set:

       help
           Get help on the interpreter commands. Optionally follow with whitespace and a command
           name; with no argument, lists all commands. '?' also invokes this.

       shell
           Execute the shell command given in the remainder of the line. '!' also invokes this.

       prefer [ repotype ]
           With no arguments, describe capabilities of all supported systems. With an argument
           (which must be the name of a supported system) this has two effects:

           First, if there are multiple repositories in a directory you do a read on, reposurgeon
           will read the preferred one (otherwise it will complain that it can’t choose among
           them).

           Secondly, this will change reposurgeon’s preferred type for output. This means that
           when you do a write to a directory, it will build a repo of the preferred type rather
           than its original type (if it had one).

           If no preferred type has been explicitly selected, reading in a repository (but not a
           fast-import stream) will implicitly set the preferred type to the type of that
           repository.

           In older versions of reposurgeon this command changed the type of the selected
           repository, if there is one. That behavior interacted badly with attempts to interpret
           legacy IDs and has been removed.

       sourcetype [ repotype ]
           Report (with no arguments) or select (with one argument) the current repository’s
           source type. This type is normally set at repository-read time, but may remain unset
           if the source was a stream file.

           The source type affects the interpretation of legacy IDs (for purposes of the =N
           visibility set and the ‘references’ command) by controlling the regular expressions
           used to recognize them. If no preferred output type has been set, it may also change
           the output format of stream files made from the repository.

           The source type is reliably set whenever a live repository is read, or when a
           Subversion stream or Fossil dump is interpreted, but not necessarily when other stream
           files are read. Streams generated by cvs-fast-export(1) using the --reposurgeon option
           are detected as CVS. In some other cases, the source system is detected from the
           presence of magic $-headers in contents blobs.

       gc [ percent ]
           Trigger a garbage collection. Scavenges and removes all blob objects that no longer
           have references, e.g. as a result of delete operations on repositories. This is
           followed by a Go-runtime garbage collection.

           The optional argument, if present, is passed as a SetGCPercent
           <https://golang.org/pkg/runtime/debug/#SetGCPercent> call to the Go runtime. The
           initial value is 100; setting it lower causes more frequent garbage collection and may
           reduce maximum working set, while setting it higher causes less frequent garbage
           collection and will raise maximum working set.

   INSTRUMENTATION
       A few commands have been implemented primarily for debugging and regression-testing
       purposes, but may be useful in unusual circumstances.

       The output of most of these commands can individually be redirected to a named output
       file. Where indicated in the syntax, you can prefix the output filename with ‘>’ and give
       it as a following argument.

       index [ >outfile ]
           Display four columns of info on objects in the selection set: their number, their
           type, the associated mark (or ‘-’ if no mark) and a summary field varying by type. For
           a branch or tag it’s the reference; for a commit it’s the commit branch; for a blob
           it’s the repository path of the file in the blob.

           The default selection set for this command is =CTRU, all objects except blobs.

       resolve [ label-text... ]
           Does nothing but resolve a selection-set expression and echo the resulting
           event-number set to standard output. The remainder of the line after the command is
           used as a label for the output.

           Implemented mainly for regression testing, but may be useful for exploring the
           selection-set language.

       log [ logclasses... ]
           Without an argument, list all log message classes, prepending a + if that class is
           enabled and a - if not.

           Otherwise, it expects a space-separated list of ‘<+ or -><log message class>’ entries,
           and enables (with +) or disables (with -) the corresponding log message class. The
           special keyword ‘all’ can be used to affect all the classes at the same time.

           For instance, ‘log -all +shout +warn’ will disable all classes except "shout" and
           "warn", which is the default setting. ‘log +all -svnparse’ would enable logging
           everything but messages from the svn parser.
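
           The way such a list is applied left to right can be sketched in Python. The class
           names used in any call are whatever the caller supplies; apply_log_spec is a
           hypothetical helper, not part of reposurgeon.

```python
def apply_log_spec(enabled, spec, all_classes):
    """Return the set of enabled classes after applying a spec string."""
    enabled = set(enabled)
    for entry in spec.split():
        sign, name = entry[0], entry[1:]
        targets = set(all_classes) if name == "all" else {name}
        if sign == "+":
            enabled |= targets
        elif sign == "-":
            enabled -= targets
        else:
            raise ValueError("entry must start with + or -: %r" % entry)
    return enabled
```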

           You can get a list of other log message classes from ‘help log’.

       logfile [ path ]
           Error, warning, and diagnostic messages are normally emitted to standard error. This
           command, with a nonempty path argument, directs them to the specified file instead.
           Without an argument, reports what logfile is set.

       print output-text...
           Does nothing but ship its argument line to standard output. Useful in regression
           tests.

       version [ version... ]
           With no argument, display the program version and the list of VCSes directly
           supported. With argument, declare the major version (single digit) or full version
           (major.minor) under which the enclosing script was developed. The program will error
           out if the major version has changed (which means the surgical language is not
           backwards compatible).

           It is good practice to start your lift script with a version requirement, especially
           if you are going to archive it for later reference.

       prompt [ format... ]
           Set the command prompt format to the value of the command line; with an empty command
           line, display it. The prompt format is evaluated after each command with the
           following dictionary substitutions:

           chosen
               The name of the selected repository, or None if none is currently selected.

           Thus, one useful format might be ‘rs[%(chosen)s]%% ’.

           More format items may be added in the future. The default prompt corresponds to the
           format ‘reposurgeon%% ’. The format line is evaluated with shell quoting of tokens, so
           that spaces can be included.
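
           The format syntax follows Python-style dictionary substitution, so its behavior can
           be previewed with a short Python sketch. This shows the substitution semantics only
           and makes no claim about how the program implements it; render_prompt is a
           hypothetical name.

```python
def render_prompt(fmt, chosen=None):
    """Expand a prompt format using %-style dictionary substitution."""
    return fmt % {"chosen": chosen}
```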

       history
           List the commands you have entered this session.

       legacy [ read | write ] [ <filename ] [ >filename ]
           Apply or list legacy-reference information. Does not take a selection set. The ‘read’
           variant reads from standard input or a <-redirected filename; the ‘write’ variant
           writes to standard output or a >-redirected filename. If neither is specified,
           defaults to ‘read’.

           A legacy-reference file maps reference cookies to (committer, commit-date,
           sequence-number) triples; these in turn (should) uniquely identify a commit. The format
           is two whitespace-separated fields: the cookie followed by an action stamp identifying
           the commit.
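
           Reading such a two-field file could look like the Python sketch below. The exact
           spelling of the cookie field in real legacy-map files is not specified here, so the
           sample value used with this helper is an assumption; parse_legacy is a hypothetical
           name, and lines that do not have exactly two fields are skipped.

```python
def parse_legacy(text):
    """Map each reference cookie to the action stamp on the same line."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            cookie, stamp = fields
            mapping[cookie] = stamp
    return mapping
```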

           It should not normally be necessary to use this command. The legacy map is
           automatically preserved through repository reads and rebuilds, being stored in the
           file legacy-map under the repository subdirectory.

       set [ option ]
           Turn on an option flag. With no arguments, list all options.

           Most options are described in conjunction with the specific operations that they
           modify. One of general interest is ‘compressblobs’; this enables compression on the
           blob files in the internal representation reposurgeon uses for editing repositories.
           With this option, reading and writing of repositories is slower, but editing a
           repository requires less (sometimes much less) disk space.

       clear [ option ]
           Turn off an option flag. With no arguments, list all options.

       timing [ >outfile ]
           Display statistics on phase timing in repository analysis. Mainly of interest to
           developers trying to speed up the program.

           If the command has following text, this creates a new, named time mark that will be
           visible in a later report; this may be useful during long-running conversion recipes.

       readlimit [number]
           Set a maximum number of commits to read from a stream. If the limit is reached before
           EOF it will be logged. Mainly useful for benchmarking. Without arguments, report the
           read limit; 0 means there is none.

       memory
           Report memory usage. Runs a garbage-collect before reporting so the figure will better
           reflect storage currently held in loaded repositories; this will not affect the
           reported high-water mark.

       profile [ live | start | save ] [ args... ]
           Profiling is enabled by default, but viewing the profile data requires either starting
           the http server with ‘profile live’, or saving it to a file with ‘profile save’. When
           no arguments are given it prints out the available types of profiles.

       exit
           Exit, reporting the time. Included here because, while EOT will also cleanly exit the
           interpreter, this command reports elapsed time since start.

WORKING WITH MERCURIAL

       There is a built-in extractor class to perform extractions from Mercurial repositories.

       Mercurial branches are exported as branches in the exported repository and tags are
       exported as tags. By default, bookmarks are ignored. You can specify explicit handling for
       bookmarks by setting ‘reposurgeon.bookmarks’ in your .hg/hgrc. Set the value to the prefix
       that reposurgeon should use for bookmarks.

       For example, if your bookmarks represent branches, put this at the bottom of your
       .hg/hgrc:

           [reposurgeon]
           bookmarks=heads/

       If you do that, it’s your responsibility to ensure that branch names do not conflict with
       bookmark names. You can add a prefix like ‘bookmarks=heads/feature-’ to disambiguate as
       necessary.
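
       To check what prefix a given hgrc would yield, a Python sketch using the standard
       configparser module is shown below. Mercurial's configuration syntax is not identical
       to what configparser accepts (for example, %include directives), so this is only an
       approximation for simple files; bookmark_prefix is a hypothetical helper.

```python
import configparser


def bookmark_prefix(hgrc_text):
    """Return the reposurgeon.bookmarks prefix from hgrc text, or None."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    return parser.get("reposurgeon", "bookmarks", fallback=None)
```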

       Alternatively, you can import directly using hg-git-fast-import
       <https://github.com/kilork/hg-git-fast-import>. This importer is not yet well tested, but
       may be substantially faster than using the extractor harness. You may wish to run test
       conversions using both methods and compare them.

   MERCURIAL SUBREPOSITORIES
       The hg extractor does not attempt to recursively handle subrepos. Rather, it will extract
       the history of the top-level repo, in which .hgsub and .hgsubstate will be treated as
       regular files. If you wish to translate these into the semantics of your target VCS, you
       will need to do so with surgical primitives after reading the history into reposurgeon.

WORKING WITH SUBVERSION

       reposurgeon can read Subversion dumpfiles. You must point it at a repository, not a
       checkout directory.

       The transaction model of Subversion is nothing like that of the DVCSes (distributed
       version control systems) that followed it. Two of the more obvious differences are around
       tags and branches.

       A Subversion tag isn’t an annotation attached to a commit. The Subversion data model is
       that a history is a sequence of surgical operations on a tree; there are no annotation
       tags as such, a tag is just another branch of the tree. Accordingly a Subversion tag is a
       copy of the state of an entire branch at a particular revision. This can be losslessly
       translated to an annotation only if no additional commits are added to the tag branch
       after the copy. But nothing prevents this! reposurgeon tries to do the right thing,
       creating a DVCS-style annotated tag when it can and otherwise preserving the changes as
       commits, using a lightweight tag to point at the tip.

       There is a subtler problem around branches themselves. In a DVCS, deleting a branch
       removes it from the repository history entirely, a fact of some significance since
       repositories are copied around often enough that keeping every discarded experiment
       forever would eventually drown the live content in superannuated cruft. Subversion
       repositories, on the other hand, are designed on the assumption that they sit on one
       server and never move. A Subversion branch is just a directory in the branch namespace; if
       you delete it, you won’t see it in following revisions but if you update to an older one
       that content will still be there. By default, reposurgeon will delete the corresponding
       branches as if the deletion was done in a DVCS, keeping only the commits that are also
       part of other branches' histories, but you can tell it to preserve the branches instead
       and give them unambiguous names in the refs/deleted namespace.

       Bad things can happen when a tag directory is created, copied from, deleted, then
       recreated from a different source directory. This is a place where the Subversion model of
       tags clashes badly with the changeset-DAG model used by git and other DVCSes. Especially
       if the same tag is recreated later! The obvious thing to do when converting this sequence
       would be to just nuke the tag history from the deletion back to its branch point, but that
       will cause problems if a copy operation was ever sourced in the deleted branch (and this
       does happen!).

       What reposurgeon does instead is preserve the most recent branch with any given name, so
       the view back from the repository head and branch tips has correct content. This does
       mean, though, that the content of any earlier branch with the same name as the visible
       (most recent) one is discarded. However, see the --preserve option of the read command.

   READING SUBVERSION REPOSITORIES
       Certain optional modifiers on the read command change its behavior when reading Subversion
       repositories:

       --nobranch
           Suppress branch analysis. The generated git repository will mirror the whole
           subversion tree, with trunk and branches as subdirectories. No directory deletions are
           translated to branch deletions, since no directories are seen as branches in the first
           place.

       --ignore-properties
           Suppress read-time warnings about discarded property settings.

       --user-ignores
           By default reposurgeon filters in-tree .gitignore files found in the history because
           they would clash with those generated from svn:ignore properties. Using this option
           makes .gitignore files be passed through. They will still be overridden by generated
           .gitignore files so this option is often used along with --no-automatic-ignores.

       --use-uuid
           If the --use-uuid read option is set, the repository’s UUID will be used as the
           hostname when faking up email addresses, a la git-svn. Otherwise, addresses will be
           generated the way git cvs-import does it, simply copying the username into the address
           field.

       --no-automatic-ignores
           Do not generate .gitignore files from svn:ignore properties.

       --preserve
           When a branch or tag was deleted in SVN, preserve the history up to deletion in a git
           ref under refs/deleted/, instead of deleting the branch and only keeping the commits
           that are also part of the history of other branches.

       --cvsignores
           Suppress the normal deletion of .cvsignore files.

       These modifiers can go anywhere in any order on the read command line after the read verb.
       They must be whitespace-separated.

       It is also possible to embed a magic comment in a Subversion stream file to set these
       options. Prefix a space-separated list of them with the magic comment ‘ #
       reposurgeon-read-options:’; the leading space is required. This may be useful when
       synthesizing test loads; in particular, a stream file that does not set up a standard
       trunk/branches/tags directory layout can use this to perform a mapping of all commits onto
       the master branch that the git importer will accept.
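
       When synthesizing test loads it can help to check which options a stream file declares.
       The following Python sketch collects them; the function name is hypothetical, and it
       assumes the magic comment starts at the beginning of a line, which the text above does
       not state explicitly.

```python
MAGIC = " # reposurgeon-read-options:"  # the leading space is required


def read_options_from_stream(lines):
    """Collect the options named on magic comment lines of a stream file."""
    options = []
    for line in lines:
        if line.startswith(MAGIC):
            # Everything after the magic prefix is a space-separated list.
            options.extend(line[len(MAGIC):].split())
    return options


if __name__ == "__main__":
    sample = [
        "SVN-fs-dump-format-version: 2\n",
        " # reposurgeon-read-options: --nobranch --ignore-properties\n",
    ]
    print(read_options_from_stream(sample))
```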

       Here are the rules used for mapping subdirectories in a Subversion repository to branches:

       •   At any given time there is a set of eligible paths and path wildcards which declare
           potential branches. See the documentation of the branchify command for how to alter
           this set, which initially consists of {trunk, tags/*, branches/*, and *}.

       •   A repository is considered “flat” if it has no directory that matches a path or path
           wildcard in the branchify set. All commits in a flat repository are assigned to branch
           master, and what would have been branch structure becomes directory structure. In this
           case, we’re done; all the other rules apply to non-flat repos.

       •   If you give the option --nobranch when reading a Subversion repository, branch
           analysis is skipped and the repository is treated as though flat (left as a linear
           sequence of commits on refs/heads/master). This may be useful if your repository
           configuration is highly unusual and you need to do your own branch surgery. Note that
           this option will disable partitioning of mixed commits.

       •   If ‘trunk’ is eligible, it always becomes the master branch.

       •   If an element of the branchify set ends with /*, it is considered a branch namespace:
           each immediate subdirectory of it is considered a potential branch, unless it itself
           appears in branchify as a namespace. If * is in the branchify set (which is true by
           default) all top-level directories are also considered potential branches (other than
           /trunk which is mapped to master, /tags, and /branches which are namespaces by
           default).

       •   Files in the top-level directory are assigned to a synthetic branch named ‘root’. If
           there is no "trunk" (or rather no master branch), then this synthetic ‘root’ branch
           becomes the master branch. You can map another directory to master using branchify and
           branchmap.

       •   Each potential branch is checked to see if it has commits on it after the initial
           creation or copy. If there are such commits, or if the branch creation or copy
           introduces changes other than the copy, it becomes a branch. If not, it may become a
           tag in order to preserve the commit metadata. In all cases, the name of any created
           tag or branch is the basename of the directory, unless another mapping is in place.
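
       The path-matching part of these rules can be sketched in Python. This is a simplified
       model for illustration, not reposurgeon's implementation: the function names are
       hypothetical, the default set is assumed to be written as trunk, tags/*, branches/*, and
       *, and the flat-repository test and the branch-versus-tag decision are left out.

```python
DEFAULT_BRANCHIFY = ("trunk", "tags/*", "branches/*", "*")


def potential_branch(path, branchify=DEFAULT_BRANCHIFY):
    """Return the potential-branch directory owning a file path, or None.

    Files in the top-level directory go to the synthetic 'root' branch.
    """
    parts = path.strip("/").split("/")
    if len(parts) == 1:
        return "root"
    namespaces = {entry[:-2] for entry in branchify if entry.endswith("/*")}
    exact = {entry for entry in branchify
             if entry != "*" and not entry.endswith("/*")}
    # Walk the directory prefixes of the path, shortest first.
    for depth in range(1, len(parts)):
        prefix = "/".join(parts[:depth])
        parent = "/".join(parts[:depth - 1])
        if prefix in exact:
            return prefix
        # An immediate subdirectory of a namespace is a potential branch
        # unless it is itself declared as a namespace.
        if parent in namespaces and prefix not in namespaces:
            return prefix
    # '*' makes remaining top-level directories potential branches.
    if "*" in branchify and parts[0] not in namespaces:
        return parts[0]
    return None


def branch_name(branch_dir):
    """Name a branch: 'trunk' becomes master, otherwise use the basename."""
    if branch_dir == "trunk":
        return "master"
    return branch_dir.rsplit("/", 1)[-1]


if __name__ == "__main__":
    for p in ("trunk/src/main.c", "branches/stable/src/main.c", "README"):
        d = potential_branch(p)
        print(p, "->", d, "->", branch_name(d))
```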

       Branch-creation operations with no following commits are tagified.

       Otherwise, each commit that only creates or deletes directories (in particular, copy
       commits for tags and branches, and commits that only change properties) will be
       transformed into a tag named after the tag or branch, containing the date/author/comment
       metadata from the commit.

       Subversion branch deletions are turned into deletealls, clearing the fileset of the
       import-stream branch. When a branch finishes with a deleteall at its tip, the deleteall is
       transformed into a tag. This rule cleans up after aborted branch renames.

       Occasionally (and usually by mistake) a branchy Subversion repository will contain
       revisions that touch multiple branches. These are handled by partitioning them into
       multiple import-stream commits, one on each affected branch. The Legacy-ID of such a split
       commit will have a pseudo-decimal part - for example, if Subversion revision 2317 touches
       three branches, the three generated commits will have IDs 2317.1, 2317.2, and 2317.3.
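
       The partitioning can be pictured with a short Python sketch. The function and its
       arguments are hypothetical, and the order in which the parts are numbered (sorted branch
       name here) is an assumption; the text above only fixes the form of the IDs.

```python
from collections import defaultdict


def split_mixed_revision(revision, paths, branch_of):
    """Partition one revision's paths by branch.

    branch_of maps a path to its branch. Returns a list of
    (legacy_id, branch, paths) tuples, one per affected branch.
    """
    by_branch = defaultdict(list)
    for path in paths:
        by_branch[branch_of(path)].append(path)
    if len(by_branch) == 1:
        # An unmixed revision keeps its plain Subversion revision number.
        (branch, members), = by_branch.items()
        return [(str(revision), branch, members)]
    # Assumption: parts are numbered from 1 in sorted branch order.
    return [
        ("%d.%d" % (revision, n), branch, by_branch[branch])
        for n, branch in enumerate(sorted(by_branch), start=1)
    ]


if __name__ == "__main__":
    def top_two(path):
        parts = path.split("/")
        return parts[0] if parts[0] == "trunk" else "/".join(parts[:2])

    touched = ["trunk/a.c", "branches/x/a.c", "branches/y/a.c"]
    for part in split_mixed_revision(2317, touched, top_two):
        print(part)
```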

       The svn:executable and svn:special properties are translated into permission settings in
       the input stream; svn:executable becomes 100755 and svn:special becomes 120000 (indicating
       a symlink; the blob contents will be the path to which the symlink should resolve).
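
       As a small Python illustration of that mapping (a sketch with a hypothetical function
       name): 100644 is the ordinary fast-import mode for a regular file, and letting
       svn:special win when both properties are set is an assumption, not documented behavior.

```python
def stream_mode(properties):
    """Pick a fast-import file mode from a file's Subversion properties."""
    if "svn:special" in properties:
        return "120000"   # symlink; blob content is the link target
    if "svn:executable" in properties:
        return "100755"   # executable regular file
    return "100644"       # plain regular file


if __name__ == "__main__":
    print(stream_mode({"svn:executable": "*"}))
    print(stream_mode({"svn:special": "*"}))
    print(stream_mode({}))
```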

       Any cvs2svn:rev properties generated by cvs2svn are incorporated into the internal map
       used for reference-lifting, then discarded.

       Normally, per-directory svn:ignore properties become .gitignore files. Actual .gitignore
       files in a Subversion directory are presumed to have been created by git-svn users
       separately from native Subversion ignore properties and discarded with a warning. It is up
       to the user to merge the content of such files into the target repository by hand. But
       this behavior is changed by the --user-ignores option, which disables filtering of in-tree
       .gitignore files, and the --no-automatic-ignores option, which discards Subversion
       svn:ignore properties without translation.

       Normally, .cvsignore files left over from a Subversion repository’s ancient history as a
       CVS repository are deleted. The assumption is that the repository users want the
       (presumably more up-to-date) Subversion ignore properties to be translated. However, this
       deletion can be prevented with the --cvsignores read option.

       svn:mergeinfo properties are interpreted. Any svn:mergeinfo property on a revision A with
       a merge source containing all revisions on a branch from the forking point (or the branch
       start if the histories are independent) up to revision B produces a merge link such that
       the branch tip at revision B becomes a parent of A. The "svnmerge-integrated" properties
       produced by Subversion’s svnmerge.py script are handled the same way.

       All other Subversion properties are discarded. (This may change in a future release.) The
       property for which this is most likely to cause semantic problems is svn:eol-style.
       However, since property-change-only commits get turned into annotated tags, the translated
       tags will retain information about setting changes.

       The sub-second resolution on Subversion commit dates is discarded; Git wants integer
       timestamps only.
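
       The effect on a date can be shown in Python. This sketch uses a hypothetical function
       name and assumes the usual svn:date layout with a fractional-seconds field and a trailing
       Z for UTC.

```python
from datetime import datetime, timezone


def svn_date_to_epoch(svn_date):
    """Convert an svn:date value to whole seconds since the epoch (UTC)."""
    stamp = datetime.strptime(svn_date, "%Y-%m-%dT%H:%M:%S.%fZ")
    # Drop the sub-second part explicitly, then interpret the rest as UTC.
    stamp = stamp.replace(microsecond=0, tzinfo=timezone.utc)
    return int(stamp.timestamp())


if __name__ == "__main__":
    print(svn_date_to_epoch("2011-10-24T00:00:00.500000Z"))
```

       Two revisions committed within the same second therefore end up with identical
       timestamps after conversion.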

       Because fast-import format cannot represent an empty directory, empty directories in
       Subversion repositories will be lost in translation.

       Normally, Subversion local usernames are mapped in the style of git cvs-import; thus user
       ‘foo’ becomes ‘foo <foo>’, which is sufficient to pacify git and other systems that
       require email addresses. With the --use-uuid read option, usernames are mapped in the
       git-svn style, with the repository’s UUID used as a fake domain in the email address. Both
       forms can be remapped to real addresses using the authors read command.
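
       The two mapping styles look like this in a Python sketch. The function name and the
       sample UUID are made up for illustration.

```python
def fake_attribution(username, uuid=None):
    """Build a placeholder 'name <address>' string for a Subversion user."""
    if uuid:
        # git-svn style: the repository UUID stands in for a mail domain.
        return "%s <%s@%s>" % (username, username, uuid)
    # git cvs-import style: the bare username doubles as the address.
    return "%s <%s>" % (username, username)


if __name__ == "__main__":
    print(fake_attribution("foo"))
    print(fake_attribution("foo", "3a2b1c0d-0000-4000-8000-123456789abc"))
```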

       Reading a Subversion stream enables writing of the legacy map as 'legacy-id' passthroughs
       when the repo is written to a stream file.

       reposurgeon tries hard to silently do the right thing, but there are Subversion edge cases
       in which it emits warnings because a human may need to intervene and perform fixups by
       hand. Here are the less obvious messages it may emit:

       user-created .gitignore ignored
           This message means reposurgeon has found a .gitignore file in the Subversion
           repository it is analyzing. This probably happened because somebody was using git-svn
           as a live gateway, and created ignores which may or may not be congruent with those in
           the generated .gitignore files that the Subversion ignore properties will be
           translated into. You’ll need to make a policy decision about which set of ignores to
           use in the conversion, and possibly set the --user-ignores option on read to pass
           through user-created .gitignore files; in that case this warning will not be emitted.

       properties set
           reposurgeon has detected a setting of a user-defined property, or of the Subversion
           property svn:externals. These properties cannot be expressed in an import stream;
           the user is notified in case this is a showstopper for the conversion or some
           corrective action is required, but normally this warning can be ignored. This warning
           is suppressed by the --ignore-properties option.

       Detected link from <revision> to <revision> might be dubious
           When trying to detect parent links from multiple file copies like those cvs2svn can
           produce, source revisions of the different copies were not all the same. The link
           should probably be monitored because it has a non-negligible probability of being
           slightly wrong. This does not impact the tree contents, only the quality of the
           history.

   MID-BRANCH DELETIONS
       When a branch A is deleted and a branch B is copied to the name A, the Subversion intent
       is to replace the contents of branch A with the contents of branch B, keeping the A name.
       This is a poor man’s merge from before "svn merge" existed. Many Subversion users who
       formed their habits before svn merge existed still operate this way.

       In git terms, this almost corresponds to a merge of A into B followed by a rename of B to
       A. Branch B continues to exist, however, so we can’t do that in translation. The
       reposurgeon logic does not try to be clever about this, because "clever" would have
       rebarbative edge cases; the sequence is translated into a deleteall followed by a commit
       operation that recreates the B files under corresponding A names. No merge link is
       created. The commit filling A with a branch copy from B will have B as its first parent,
       though, so all that would be needed is to create a merge link from the old A before the
       delete to the commit recreating A.

       This case is mentioned here because it is likely to confuse the merge-tracking algorithms
       used, e.g., by git diff, or if you ever try to merge a branch that forked off the old A to
       a branch spun off the new (and expect git to know that you do not want to incorporate old
       A’s changes).

   MULTIPROJECT REPOSITORIES
       Subversion repositories are sometimes organized to hold multiple projects, with the root
       directory containing one subdirectory per project and each subdirectory having its own
       trunk/branches/tags layout.

       Suppose you have a stream dump from a repository with two project subdirectories, project1
       and project2. The pattern for dissecting out project1 looks like this:

           branchify project1/trunk project1/branches/* project1/tags *
           branchmap :project1/trunk:heads/master: :project1/tags:tags: :project1/branches:branches:
           set testmode
           read <multiproject.svn
           branch project2 delete

       The first command branchifies every directory underneath project1 for which that’s
       required, with project2 left as its own branch from top level. The second command sets up a
       transform of these branches into a standard layout.

       These transformations are performed when the actual read of the repository happens.
       Following that, the unneeded project2 branch can be dropped.

       Of course we could have done the same thing with project2 and dropped project1. Repeat
       this as many times as required to turn each partial into an autonomous git repository.

       While something like this could be done with repocutter sift commands, that would not
       correctly resolve Subversion copies across projects. This reposurgeon procedure handles
       those correctly.

IGNORE PATTERNS

       reposurgeon recognizes how supported VCSes represent file ignores (CVS .cvsignore files
       lurking untranslated in older Subversion repositories, Subversion ignore properties,
       .gitignore/.hgignore/.bzrignore files in other systems) and moves ignore declarations among
       these containers on repo input and output. This will be sufficient if the ignore patterns
       are exact filenames.

       Translation may not, however, be perfect when the ignore patterns are Unix glob patterns
       or regular expressions. This compatibility table describes which patterns will translate;
       "plain" indicates a plain filename with no glob or regexp syntax or negation, "no !" means
       no negated regexps, and "no RE:" means the RE prefix for a regular expression does not
       work.

       RCS has no ignore files or patterns and is therefore not included in the table.

       ┌─────────┬──────────┬──────────┬──────────┬─────────┬────────────┬────────────┬──────────┬─────────┐
       │         │          │          │          │         │            │            │          │         │
       │         │ from CVS │ from svn │ from git │ from hg │ from bzr   │ from darcs │ from SRC │ from bk │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to CVS   │ all      │ all      │ no ! &   │ all     │ no RE:, no │ plain      │ all      │ all     │
       │         │          │          │ nonempty │         │ !          │            │          │         │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to svn   │ no !     │ all      │ no !     │ all     │ no RE:, no │ plain      │ all      │ all     │
       │         │          │          │          │         │ !          │            │          │         │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to git   │ all      │ all      │ all      │ no !    │ no RE:     │ plain      │ all      │ all     │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to hg    │ no !     │ all      │ no !     │ all     │ no RE:, no │ plain      │ all      │ all     │
       │         │          │          │          │         │ !          │            │          │         │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to bzr   │ all      │ all      │ all      │ all     │ all        │ plain      │ all      │ all     │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to darcs │ plain    │ plain    │ plain    │ plain   │ plain      │ all        │ all      │ all     │
       ├─────────┼──────────┼──────────┼──────────┼─────────┼────────────┼────────────┼──────────┼─────────┤
       │         │          │          │          │         │            │            │          │         │
       │to SRC   │ no !     │ all      │ no !     │ all     │ no RE:, no │ plain      │ all      │ all     │
       │         │          │          │          │         │ !          │            │          │         │
       └─────────┴──────────┴──────────┴──────────┴─────────┴────────────┴────────────┴──────────┴─────────┘

       The hg row and column of the table describe compatibility with hg’s glob syntax rather
       than its default regular-expression syntax. When writing to an hg repository from any
       other kind, reposurgeon prepends a ‘syntax: glob’ line to the output .hgignore.
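
       For the git-to-hg cell of the table ("no !"), the translation can be pictured with the
       Python sketch below. It is only an illustration with a hypothetical function name; how
       reposurgeon itself treats a negated pattern it cannot translate is not stated here, so
       the sketch simply reports such patterns instead of guessing.

```python
def gitignore_to_hgignore(text):
    """Return (.hgignore text, list of patterns that do not translate)."""
    kept, rejected = [], []
    for line in text.splitlines():
        if line.startswith("!"):
            rejected.append(line)   # negated pattern: no hg glob equivalent
        else:
            kept.append(line)
    # hg defaults to regular expressions, so select glob syntax first.
    body = "".join(line + "\n" for line in kept)
    return "syntax: glob\n" + body, rejected


if __name__ == "__main__":
    converted, problems = gitignore_to_hgignore("*.o\n!keep.o\nbuild/\n")
    print(converted, end="")
    print("untranslated:", problems)
```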

TRANSLATION STYLE

       After converting a CVS, SVN, or BitKeeper repository, check for and remove $-cookies in
       the head revision(s) of the files. The full Subversion set is $Date:, $Revision:,
       $Author:, $HeadURL:, and $Id:. CVS uses $Author:, $Date:, $Header:, $Id:, $Log:, $Revision:,
       also (rarely) $Locker:, $Name:, $RCSfile:, $Source:, and $State:.

       When you need to specify a commit, use the action-stamp format that references lift
       generates when it can resolve an SVN or CVS reference in a comment. It is best that you
       not vary from this format, even in trivial ways like omitting the 'Z' or changing the 'T'
       or '!' or ':'. Making action stamps uniform and machine-parseable will have good
       consequences for future repository-browsing tools.
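
       A strict check helps keep hand-written stamps uniform. The Python sketch below assumes
       an action stamp has the shape of an RFC 3339 UTC timestamp, a '!', and a committer
       address, for instance 2011-10-25T15:11:09Z!fred@foonly.com; the function name is
       hypothetical and the pattern is deliberately loose about what follows the '!'.

```python
import re

# Timestamp with literal 'T' and 'Z', then '!', then a non-blank address.
ACTION_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z!\S+")


def is_action_stamp(text):
    """Return True if text looks like a well-formed action stamp."""
    return ACTION_STAMP.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_action_stamp("2011-10-25T15:11:09Z!fred@foonly.com"))
    print(is_action_stamp("2011-10-25T15:11:09!fred@foonly.com"))
```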

       Sometimes, in converting a repository, you may need to insert an explanatory comment - for
       example, if metadata has been garbled or is missing and you need to point to that fact. It’s
       helpful for repository-browsing tools if there is a uniform syntax for this that is highly
       unlikely to show up in repository comments. We recommend enclosing translation notes in [[
       ]]. This has the advantage of being visually similar to the [ ] traditionally used for
       editorial comments in text.

       It is good practice to include, in the comment for the root commit of the repository, a
       note dating and attributing the conversion work and explaining these conventions. Example:

           [[This repository was converted from Subversion to git on 2011-10-24
           by Eric S. Raymond <esr@thyrsus.com>. Here and elsewhere, conversion
           notes are enclosed in double square brackets. Junk commits generated
           by cvs2svn have been removed, commit references have been mapped into
           a uniform VCS-independent syntax, and some comments edited into
           summary-plus-continuation form.]]
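
       A uniform bracket syntax makes such notes easy to find mechanically. As a minimal Python
       sketch of what a browsing tool might do (the function name is hypothetical):

```python
import re

# Non-greedy match so that several notes in one comment stay separate.
NOTE = re.compile(r"\[\[(.*?)\]\]", re.DOTALL)


def translation_notes(comment):
    """Return the translation notes in a comment, whitespace-normalized."""
    return [" ".join(body.split()) for body in NOTE.findall(comment)]


if __name__ == "__main__":
    text = "Fix bug.\n\n[[Original author\nunknown.]]\n"
    print(translation_notes(text))
```

       Single square brackets, as used for ordinary editorial remarks, are not matched.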

       It is also good practice to include a generated tag at the point of conversion. For example:

           msgin --create <<EOF
           Tag-Name: git-conversion

           Marks the spot at which this repository was converted from Subversion to git.
           EOF

ADVANCED EXAMPLES

           define lastchange {
           @max(=B & [/ChangeLog/] & /{0}/B)? list
           }

       List the last commit that refers to a ChangeLog file containing a specified string. (The
       trick here is that ? extends the singleton set consisting of the last eligible ChangeLog
       blob to its set of referring commits, and list only notices the commits.)

           index >index.txt
           shell <index.txt awk '/refs\/tags/ {print $4}' | sort | uniq | while read t; do echo "tag $(basename "$t") rename $(basename "$t" | sed -e 's/sample/example/')"; done >renames.script
           script renames.script

       Mass-rename tags, replacing "sample" on the basename with "example". Illustrates a general
       technique of generating reposurgeon commands via shell that you then execute with the
       ‘script’ command. Enabling this technique is the reason as many commands as possible
       support redirects.
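
       The same command generation can be done in any scripting language. The Python sketch
       below mirrors the awk/sed pipeline above; the function name is hypothetical, and the
       sample index lines in the demo are invented to have the tag ref in the fourth
       whitespace-separated field, as the awk command assumes.

```python
import posixpath


def rename_commands(index_lines, old="sample", new="example"):
    """Generate 'tag X rename Y' commands from index listing lines."""
    refs = set()
    for line in index_lines:
        fields = line.split()
        if "refs/tags" in line and len(fields) >= 4:
            refs.add(fields[3])          # same field awk calls $4
    commands = []
    for ref in sorted(refs):
        name = posixpath.basename(ref)
        # Like sed without the g flag: replace the first occurrence only.
        commands.append("tag %s rename %s" % (name, name.replace(old, new, 1)))
    return commands


if __name__ == "__main__":
    listing = [
        "     5 commit    :4  refs/heads/master",
        "    12 tag       -   refs/tags/sample-1.0",
    ]
    print("\n".join(rename_commands(listing)))
```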

STREAM SYNTAX EXTENSIONS

       The event-stream parser in reposurgeon supports some extended syntax. Exporters designed
       to work with reposurgeon may have a --reposurgeon option that enables emission of extended
       syntax; notably, this is true of cvs-fast-export(1). The remainder of this section
       describes these syntax extensions. The properties they set are (usually) preserved and
       re-output when the stream file is written.

       The token ‘#reposurgeon’ at the start of a comment line in a fast-import stream signals
       reposurgeon that the remainder is an extension command to be interpreted by reposurgeon.

       One such extension command is implemented: ‘sourcetype’, which behaves identically to the
       reposurgeon sourcetype command. An exporter for a version-control system named "frobozz"
       could, for example, say

           #reposurgeon sourcetype frobozz

       Within a commit, a magic comment of the form ‘#legacy-id’ declares a legacy ID from the
       stream file’s source version-control system.

       Also accepted is the bzr syntax for setting per-commit properties. While parsing commit
       syntax, a line beginning with the token ‘property’ must continue with a
       whitespace-separated property-name token. If it is then followed by a newline it is taken
       to set that boolean-valued property to true. Otherwise it must be followed by a numeric
       token specifying a data length, a space, following data (which may contain newlines) and a
       terminating newline. For example:

           commit refs/heads/master
           mark :1
           committer Eric S. Raymond <esr@thyrsus.com> 1289147634 -0500
           data 16
           Example commit.

           property legacy-id 2 r1
           M 644 inline README

       Unlike other extensions, bzr properties are only preserved on stream output if the
       preferred type is bzr, because any importer other than bzr’s will choke on them.
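
       The property syntax described above can be parsed as in the following Python sketch. It
       is an illustration with hypothetical names, not the reposurgeon parser, and it counts
       the data length in characters of a text string; a real stream parser would count bytes.

```python
import re

# 'property', a name, then optionally ' <length> ' introducing the data.
HEADER = re.compile(r"property[ \t]+(\S+)(?: (\d+) )?")


def parse_property(stream, pos=0):
    """Parse one property at stream[pos:].

    Returns (name, value, next_pos); value is True for a boolean property.
    """
    m = HEADER.match(stream, pos)
    if not m:
        raise ValueError("not a property line")
    name, length = m.group(1), m.group(2)
    if length is None:
        value, end = True, m.end()
    else:
        end = m.end() + int(length)
        value = stream[m.end():end]   # may contain embedded newlines
    if stream[end:end + 1] != "\n":
        raise ValueError("missing terminating newline")
    return name, value, end + 1


if __name__ == "__main__":
    text = "property legacy-id 2 r1\nM 644 inline README\n"
    print(parse_property(text))
```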

INCOMPATIBLE LANGUAGE CHANGES

       In versions before 3.23, ‘prefer’ changed the repository type as well as the preferred
       output format.

       In versions before 3.0, the general command syntax put the command verb first, then the
       selection set (if any) then modifiers (VSO). It has changed to optional selection set
       first, then command verb, then modifiers (SVO). The change made parsing simpler, allowed
       abolishing some noise keywords, and recapitulates a successful design pattern in some
       other Unix tools - notably sed(1).

       In versions before 3.0, path expressions only matched commits, not commits and the
       associated blobs as well. The names of the "a" and "c" flags were different.

       In reposurgeon versions before 3.0, the delete command had the semantics of squash; also,
       the policy flags did not require a ‘--’ prefix. The ‘--delete’ flag was named
       "obliterate".

       In reposurgeon versions before 3.0, read and write optionally took file arguments rather
       than requiring redirects (and the write command never wrote into directories). This was
       changed in order to allow these commands to have modifiers. These modifiers replaced
       several global options that no longer exist.

       In reposurgeon versions before 3.0, the earliest factor in a unite command always kept its
       tag and branch names unaltered. The new rule for resolving name conflicts, giving priority
       to the latest factor, produces more natural behavior when uniting two repositories end to
       end; the master branch of the second (later) one keeps its name.

       In reposurgeon versions before 3.0, the tagify command expected policies as trailing
       arguments to alter its behaviour. The new syntax uses similarly named options with leading
       dashes, that can appear anywhere after the tagify command.

       In versions before 2.9, the syntax of authors, legacy, list, and what are now msg{in|out}
       was different (and legacy was fossils). They took plain filename arguments rather than
       using redirect < and >.

       In versions before 4.0, msgin and msgout were named "mailbox_in" and "mailbox_out";
       branchify was "branchify_map". Previous versions used the Python variant of regular
       expressions; some of the more idiosyncratic features of these are not replicated in the Go
       implementation.

LIMITATIONS AND GUARANTEES

       Guarantee: In DVCSes that use commit hashes, editing with reposurgeon never changes the
       hash of a commit object unless (a) you edit the commit, or (b) it is a descendant of an
       edited commit in a VCS that includes parent hashes in the input of a child object’s hash
       (git and hg both do this).

       Guarantee: reposurgeon only requires main memory proportional to the size of a
       repository’s metadata history, not its entire content history. (Exception: the data from
       inline content is held in memory.)

       Guarantee: In the worst case, reposurgeon makes its own copy of every content blob in the
       repository’s history and thus uses intermediate disk space approximately equal to the size
       of a repository’s content history. However, when the repository to be edited is presented
       as a stream file, reposurgeon requires no or only very little extra disk space to
       represent it; the internal representation of content blobs is a (seek-offset, length) pair
       pointing into the stream file.
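
       The (seek-offset, length) idea can be sketched in Python as follows. This is a
       simplified illustration with hypothetical function names, not reposurgeon's code; it
       handles only the counted form of the data command and indexes only marked blobs.

```python
import io


def index_blobs(stream):
    """Map each blob mark to an (offset, length) pair in a binary stream."""
    refs, kind, mark = {}, None, None
    while True:
        line = stream.readline()
        if not line:
            break
        if line == b"blob\n":
            kind, mark = "blob", None
        elif line.startswith((b"commit ", b"tag ", b"reset ")):
            kind, mark = "other", None
        elif line.startswith(b"mark "):
            mark = line[5:].strip().decode()
        elif line.startswith(b"data "):
            length = int(line[5:].strip())
            if kind == "blob" and mark:
                refs[mark] = (stream.tell(), length)
            # Skip the payload instead of reading it into memory.
            stream.seek(length, io.SEEK_CUR)
    return refs


def read_blob(stream, ref):
    """Fetch blob content on demand from its (offset, length) pair."""
    offset, length = ref
    stream.seek(offset)
    return stream.read(length)


if __name__ == "__main__":
    demo = io.BytesIO(b"blob\nmark :1\ndata 6\nhello\n\n")
    table = index_blobs(demo)
    print(table, read_blob(demo, table[":1"]))
```

       Only the small table of pairs lives in memory; the content stays in the stream file
       until it is needed.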

       Guarantee: reposurgeon never modifies the contents of a repository it reads, nor deletes
       any repository. The results of surgery are always expressed in a new repository.

       Guarantee: Any line in a fast-import stream that is not a part of a command reposurgeon
       parses and understands will be passed through unaltered. At present the set of potential
       passthroughs is known to include the progress, options, and checkpoint commands as well as
       comments led by #.

       Guarantee: All reposurgeon operations either preserve all repository state they are not
       explicitly told to modify or warn you when they cannot do so.

       Guarantee: reposurgeon handles the bzr commit-properties extension, correctly passing
       through property items including those with embedded newlines. (Such properties are also
       editable in the message-box format.)

       Limitation: In Subversion, sufficiently weird sequences of tag creations, branch copies
       from tags, and tag deletions followed by recreations of the tag can confuse reposurgeon,
       causing visible content mismatches.

       Limitation: Because reposurgeon relies on other programs to generate and interpret the
       fast-import command stream, it is subject to bugs in those programs.

       Limitation: bzr suffers from deep confusion over whether its unit of work is a repository
       or a floating branch that might have been cloned from a repo or created from scratch, and
       might or might not be destined to be merged to a repo one day. Its exporter only works on
       branches, but its importer creates repos. Thus, a rebuild operation will produce a
       subdirectory structure that differs from what you expect. Look for your content under the
       subdirectory ‘trunk’.

       Limitation: under git, signed tags are imported verbatim. However, any operation that
       modifies any commit upstream of the target of the tag will invalidate it.

       Limitation: Stock git (at least as of version 1.7.3.2) will choke on property extension
       commands. Accordingly, reposurgeon omits them when rebuilding a repo with git type.

       Limitation: Converting an hg repo that uses bookmarks (not branches) to git can lose
       information; the branch ref that git assigns to each commit may not be the same as the hg
       bookmark that was active when the commit was originally made under hg. Unfortunately, this
       is a real ontological mismatch, not a problem that can be fixed by cleverness in
       reposurgeon.

       Limitation: Converting an hg repo that uses branches to git can lose information because
       git does not store an explicit branch as part of commit metadata, but colors commits with
       branch or tag names on the fly using a specific coloring algorithm, which might not match
       the explicit branch assignments to commits in the original hg repo. Reposurgeon preserves
       the hg branch information when reading an hg repo, so it is available from within
       reposurgeon itself, but there is no way to preserve it if the repo is written to git.

       Limitation: Not all BitKeeper versions have the fast-import and fast-export commands that
       reposurgeon requires. They are present in versions back to the 7.3 open-source release.

       Limitation: reposurgeon may misbehave under a filesystem which smashes case in filenames,
       or which nominally preserves case but maps names differing only by case to the same
       filesystem node (Mac OS X behaves like this by default). Problems will arise if any two
       paths in a repo differ by case only. To avoid the problem on a Mac, do all your surgery on
       an HFS+ file system formatted with case sensitivity specifically enabled.
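
       Before working on such a filesystem, it can help to check whether a repo contains any
       paths that differ by case only. The following is a minimal sketch, not part of
       reposurgeon: the function name case_collisions is hypothetical, the path list is
       assumed to have been gathered by whatever means your VCS offers, and casefold() is
       only an approximation of how a real filesystem compares names.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of paths that differ from each other by case only."""
    groups = defaultdict(set)
    for path in paths:
        # casefold() approximates case-insensitive comparison; a real
        # filesystem may apply additional rules of its own.
        groups[path.casefold()].add(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    sample = ["README", "Readme", "src/main.c", "src/Main.c", "Makefile"]
    for group in case_collisions(sample):
        print(" <-> ".join(group))
```

       An empty result suggests that the paths checked will not collide in this way; any
       group reported is a set of paths that would map to the same filesystem node.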

       Limitation: If whitespace followed by # appears in a string or regexp command argument, it
       will be misinterpreted as the beginning of a line-ending comment and screw up parsing.
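
       The effect can be illustrated with a naive comment stripper. This is a sketch of the
       failure mode only, not reposurgeon’s actual parser; the function name
       strip_trailing_comment and the sample command line are assumptions made for
       illustration.

```python
import re


def strip_trailing_comment(line):
    """Cut the line at the first '#' that is preceded by whitespace."""
    return re.split(r"\s+#", line, maxsplit=1)[0].rstrip()


if __name__ == "__main__":
    # The '#' here is meant to be part of a search regexp, but a stripper
    # of this kind cannot tell that and truncates the command.
    print(strip_trailing_comment("/fix #42/ list"))   # prints "/fix"
    # A '#' with no whitespace before it is left alone.
    print(strip_trailing_comment("/fix#42/ list"))    # prints "/fix#42/ list"
```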

       Guarantee: As version-control systems add support for the fast-import format, their
       repositories will become editable by reposurgeon.

       The limitations described above are unlikely to change. Do ‘help bugs’ at the reposurgeon
       prompt to see up-to-date information on reposurgeon bugs and internal problems that are
       expected to be fixed in some future release.

REQUIREMENTS

       reposurgeon relies on importers and exporters associated with the VCSes it supports.

       git
           Core git supports both export and import.

       bzr
           Requires bzr plus the bzr-fast-import plugin.

       hg
           Requires core hg, the hg-fastimport plugin, and (unless using reposurgeon’s built-in
           hg-extractor) the third-party hg-fast-export.py script.

       svn
           Stock Subversion commands support export and import.

       darcs
           Stock darcs commands support export.

       CVS
           Requires cvs-fast-export. Note that the quality of CVS lifts may be poor, with
           individual lifts requiring serious hand-hacking. This is due to inherent problems with
           CVS’s file-oriented model.

       RCS
           Requires cvs-fast-export (yes, that’s not a typo; cvs-fast-export handles RCS
           collections as well). The caveat for CVS applies.

CRASH RECOVERY

       This section will become relevant only if reposurgeon or something underneath it in the
       software and hardware stack crashes while in the middle of writing out a repository, in
       particular if the target directory of the rebuild is your current directory.

       The tool has two conflicting objectives. On the one hand, we never want to risk clobbering
       a pre-existing repo. On the other hand, we want to be able to run this tool in a directory
       with a repo and modify it in place.

       We resolve this dilemma by playing a game of three-directory monte.

        1. First, we build the repo in a freshly-created staging directory. If your target
           directory is named /path/to/foo, the staging directory will be a peer named
           /path/to/foo-stageNNNN, where NNNN is a cookie derived from reposurgeon’s process ID.

        2. We then make an empty backup directory. This directory will be named /path/to/foo.~N~,
           where N is incremented so as not to conflict with any existing backup directories.
           reposurgeon never, under any circumstances, ever deletes a backup directory.

           So far, all operations are safe; the worst that can happen up to this point if the
           process gets interrupted is that the staging and backup directories get left behind.

        3. The critical region begins. We first move everything in the target directory to the
           backup directory.

        4. Then we move everything in the staging directory to the target.

        5. We finish off by restoring untracked files in the target directory from the backup
           directory. That ends the critical region.

       During the critical region, all signals that can be ignored are ignored.
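
       The five steps above can be sketched as follows. This is an illustration under stated
       assumptions, not reposurgeon’s own code: the names rebuild_in_place, next_backup_dir,
       and move_contents are hypothetical, as are the build and is_untracked callbacks. The
       sketch uses the raw process ID as the staging cookie, copies (rather than moves)
       untracked files back from the backup, and omits the signal handling.

```python
import os
import shutil


def next_backup_dir(target):
    """Return the first unused name of the form target.~N~, counting N up from 1."""
    n = 1
    while os.path.exists(f"{target}.~{n}~"):
        n += 1
    return f"{target}.~{n}~"


def move_contents(src, dst):
    """Move every top-level entry of src into dst."""
    for name in os.listdir(src):
        shutil.move(os.path.join(src, name), os.path.join(dst, name))


def rebuild_in_place(target, build, is_untracked):
    """Replace the contents of target with whatever build() produces.

    build(staging_dir) populates the staging directory; is_untracked(name)
    says whether a top-level entry of the old target should be restored.
    Both callbacks are hypothetical stand-ins for this sketch.
    """
    target = os.path.abspath(target)

    # Step 1: build in a fresh peer directory. Using the raw PID as the
    # cookie is an assumption of this sketch.
    staging = f"{target}-stage{os.getpid()}"
    os.mkdir(staging)
    build(staging)

    # Step 2: create an empty backup directory under an unused name.
    backup = next_backup_dir(target)
    os.mkdir(backup)

    # Steps 3 and 4: the critical region (signal handling omitted here).
    move_contents(target, backup)
    move_contents(staging, target)

    # Step 5: bring untracked entries back. Copying keeps the backup
    # complete; that choice is an assumption of this sketch.
    for name in os.listdir(backup):
        src, dst = os.path.join(backup, name), os.path.join(target, name)
        if is_untracked(name) and not os.path.exists(dst):
            if os.path.isdir(src):
                shutil.copytree(src, dst)
            else:
                shutil.copy2(src, dst)

    os.rmdir(staging)  # the staging directory is empty by now
    return backup
```

       Note that the sketch never removes the backup directory; it returns the backup’s path
       so the caller can inspect it.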

ERROR RETURNS

       Returns 1 on fatal error, 0 otherwise. In batch mode all errors are fatal.

SEE ALSO

       bzr(1), cvs(1), darcs(1), git(1), hg(1), rcs(1), svn(1), bk(1).

AUTHOR

       Eric S. Raymond <esr@thyrsus.com>; see the project page
       <http://www.catb.org/~esr/reposurgeon>.

                                            2023-01-20                             REPOSURGEON(1)