Ubuntu Manpage: makepp_extending -- How to extend makepp using Perl

NAME

       makepp_extending -- How to extend makepp using Perl

DESCRIPTION

       Makepp is internally flexible enough that, by writing a little bit of Perl code, you can
       add functions or do a number of other operations.

   General notes on writing Perl code to work with makepp
       Each makefile lives in its own package.  Thus definitions in one makefile do not affect
       definitions in another makefile.  A common set of functions including all the standard
       textual manipulation functions is imported into the package when it is created.

       Makefile variables are stored as Perl scalars in that package.  (There are exceptions to
       this: automatic variables and the default value of variables like CC are actually
       implemented as functions with no arguments.  Target specific vars, command line vars and
       environment vars are not seen this way.)  Thus any Perl code you write has access to all
       makefile variables.  Global variables are stored in the "Mpp::global" package.  See
       Makefile variables for the details.

       Each of the statements (ifperl / ifmakeperl, perl / makeperl, sub / makesub), the
       functions (perl / makeperl, map / makemap) and the rule action (perl / makeperl) for
       writing Perl code directly in the makefile come in two flavours.  The first is absolutely
       normal Perl, meaning you have to use the "f_" prefix as explained in the next section, if
       you want to call makepp functions.  The second variant first passes the statement through
       Make-style variable expansion, meaning you have to double the "$"s you want Perl to see.

       End handling is special because makepp's huge (depending on your build system) data
       structures would take several seconds to garbage collect with a normal exit.  So we do a
       brute force exit.  In the main process you can still have "END" blocks but if you have any
       global file handles they may not get flushed.  But you should be using the modern lexical
       filehandles, which get closed properly when going out of scope.

       In Perl code run directly as a rule action or via a command you define, it is the
       opposite.  "END" blocks will not be run, but global filehandles get flushed for you.  The
       "DESTROY" of global objects will never be run.

   Adding new textual functions
       You can add a new function to makepp's repertoire by simply defining a Perl subroutine of
       the same name but with a prefix of "f_".  For example:

           sub f_myfunc {
             my $argument = &arg;      # Name the argument.
             my( undef, $mkfile, $mkfile_line ) = @_; # Name the arguments.

             ... do something here

             return $return_value;
           }

           XYZ := $(myfunc my func arguments)

       If your function takes no arguments, there is nothing to do.  If your function takes one
       argument, as in the example above, use the simple accessor &arg to obtain it.  If you
       expect more arguments, you need the more complex accessor "args" described below.

       These accessors process the same three parameters that should be passed to any "f_"
       function, namely the function arguments, the makefile object and a line descriptor for
       messages.  Therefore you can use the efficient &arg form in the first case: called
       without parentheses, it hands the current @_ straight to the accessor.

       The &arg accessor takes care of the following for you: If the argument was already
       expanded (e.g. to find the name of the function in "$(my$(function) arg)"), the arg is
       passed as a string and just returned.  If the argument still needs expansion, which is
       the usual case, it is instead a reference to a string.  The &arg accessor expands it for
       you, for which it needs the makefile object as its 2nd parameter.
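
       As a rough illustration of this string-versus-reference convention, the following
       stand-alone sketch mimics the behaviour just described.  It is not makepp's
       implementation: "my_arg", "StubMakefile" and its "expand_text" method are hypothetical
       stand-ins, and the expansion handles only simple "$(NAME)" references.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the makefile object; it only holds a variable table.
package StubMakefile;
sub new { my( $class, %vars ) = @_; bless { vars => {%vars} }, $class }
sub expand_text {               # Assumed method name; expands $(NAME) only.
  my( $self, $text ) = @_;
  $text =~ s/\$\((\w+)\)/exists $self->{vars}{$1} ? $self->{vars}{$1} : ''/ge;
  $text;
}

package main;

# Same three parameters as an f_ function: argument, makefile, line descriptor.
sub my_arg {
  my( $text, $mkfile, $mkfile_line ) = @_;
  return $text unless ref $text;                # Already expanded: plain string.
  $mkfile->expand_text( $$text, $mkfile_line ); # Reference: expand it now.
}

my $mkfile = StubMakefile->new( CC => 'gcc' );
print my_arg( 'already expanded', $mkfile, 'Makeppfile:1' ), "\n";
print my_arg( \'compiler is $(CC)', $mkfile, 'Makeppfile:2' ), "\n";
```

       The point of the design is that expansion is deferred until a function actually asks
       for its argument, so a function that ignores its argument never pays for expanding it.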

       If you expect more arguments, possibly in variable number, the job is performed by "args".
       This accessor takes the same 3 parameters as arg, plus additional parameters:

       max: number of args (default 2): give ~0 (maxint) for endless
       min: number of args (default 0 if max is ~0, else same as max)
       only_comma: don't eat space around commas, usual for non-filename

       At most max, but at least min commas present before expansion are used to split the
       arguments.  Some examples from makepp's builtin functions:

           my( $prefix, $text ) = args $_[0], $_[1], $_[2], 2, 2, 1; # addprefix
           for my $cond ( args $_[0], undef, $_[2], ~0 ) ... # and, or
           my @args= args $_[0], $_[1], $_[2], ~0, 1, 1; # call
           my( $filters, $words ) = args $_[0], $_[1], $_[2]; # filter
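
       To make the roles of max, min and only_comma more concrete, here is a stand-alone
       sketch of comma splitting with the defaults listed above.  "split_args" is a
       hypothetical helper written for this illustration; it works on an already expanded
       string and is only an approximation of what "args" does inside makepp.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into at most $max arguments at commas.
sub split_args {
  my( $text, $max, $min, $only_comma ) = @_;
  $max = 2 unless defined $max;                      # Default max is 2.
  $min = $max == ~0 ? 0 : $max unless defined $min;  # Default min depends on max.
  my $sep = $only_comma ? qr/,/ : qr/\s*,\s*/;       # Eat space unless only_comma.
  my @parts = $max == ~0
    ? split( $sep, $text, -1 )          # Endless: split at every comma.
    : split( $sep, $text, $max );       # Else the last part keeps extra commas.
  die "expected at least $min arguments, got " . @parts . "\n" if @parts < $min;
  @parts;
}

print join( '|', split_args( 'lib, a.o b.o', 2, 2, 0 ) ), "\n";
print join( '|', split_args( 'a,b,c', ~0 ) ), "\n";
```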

       The function should return a scalar string (not an array) which is then inserted into the
       text at that point.

       If your function encounters an error, it should die using the usual Perl die statement.
       This will be trapped by makepp and an error message displaying the file name and the line
       number of the expression causing the error will be printed out.
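
       The following stand-alone sketch shows both conventions together: a function that
       returns one scalar string and dies on bad input, and a caller that traps the error and
       adds the location.  "f_percent" and "call_with_location" are hypothetical names, and
       the trapping code only imitates the reporting described above; it is not taken from
       makepp.

```perl
use strict;
use warnings;

# Hypothetical function: returns one scalar string, dies on bad input.
sub f_percent {
  my( $text, $mkfile, $mkfile_line ) = @_;  # Assumes $text is already expanded.
  $text =~ /^\s*(\d+)\s*$/ or die "percent: '$text' is not a whole number\n";
  "$1%";
}

# Stand-in for the caller: trap the error and prefix the line descriptor.
sub call_with_location {
  my( $func, $text, $mkfile_line ) = @_;
  my $result = eval { $func->( $text, undef, $mkfile_line ) };
  return "$mkfile_line: $@" unless defined $result;
  $result;
}

print call_with_location( \&f_percent, ' 42 ', 'Makeppfile:7' ), "\n";
print call_with_location( \&f_percent, 'many', 'Makeppfile:8' );
```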

       There are essentially no limits on what the function can do; you can access the file
       system, run shell commands, etc.

       At present, expressions appearing in dependencies and in the rule actions are expanded
       once while expressions appearing in targets are expanded twice, so be careful if your
       function has side effects and is present in an expression for a target.

       Note that the environment (in particular, the cwd) in which the function evaluates will
       not necessarily match the environment in which the rules from the Makefile in which the
       function was evaluated are executed.  If this is a problem for you, then your function
       probably ought to look something like this:

           sub f_foo {
             my( undef, $makefile ) = @_; # The makefile object is the 2nd parameter.
             ...
             chdir $makefile->{CWD};

             ... etc.
           }

   Putting functions into a Perl module
       If you put functions into an include file, you will have one copy per Makeppfile which
       uses it.  To avoid that, you can write them as a normal Perl module with an "Exporter"
       interface, and use that.  This will load faster and save memory:

           perl { use mymodule }
           perl {
               use my::module;         # put : on a new line so this is not parsed as a rule
           }

       If you need any of the functions normally available in a Makefile (like the "f_"
       functions, "arg" or "args"), you must put this line into your module:

           use Mpp::Subs;

       The drawback is that the module would be in a different package than a function directly
       appearing in a makefile.  So you need to pass in everything as parameters, or construct
       names with Perl's "caller" function.
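
       A minimal sketch of such a module follows.  The module name "My::MppFuncs" and the
       function "f_reverse_words" are hypothetical, and the package is shown inline so that
       the example runs on its own; in a real setup it would live in its own file on the Perl
       include path.

```perl
# Would be the file My/MppFuncs.pm in a real setup.
package My::MppFuncs;           # Hypothetical module name.
use strict;
use warnings;
use Exporter 'import';
our @EXPORT = qw(f_reverse_words);
# use Mpp::Subs;                # Needed in a real module that calls arg or args.

sub f_reverse_words {
  my( $text ) = @_;             # Assumes an already expanded plain string.
  join ' ', reverse split ' ', $text;
}

package main;
My::MppFuncs->import;           # What "use My::MppFuncs" does after loading the file.
print f_reverse_words( 'one two three' ), "\n";
```

       Because the function is compiled once in its own package and merely imported by name,
       every makefile that uses the module shares the same copy.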

   Calling external Perl scripts
       If you call an external Perl script via "system", or as a rule action, makepp will fork a
       new process (unless it's the last rule action) and fire off a brand new perl interpreter.
       There's nothing wrong with that, except that there's a more efficient way:

       &command arguments...
           This can be a rule action.  It will call a function command with a "c_" prefix, and
           pass it the remaining (optionally quoted makepp style -- not exactly the same as
           Shell) arguments.  If such a function cannot be found, this passes all strings to
           "run".

               sub c_mycmd { my @args = @_; ... }

               $(phony callcmd):
                   &mycmd 'arg with space' arg2 "arg3" # calls c_mycmd

               %.out: %.in
                   &myscript -o $(output) $(input) # calls external myscript

           You can write your commands within the framework of the builtins, allowing you to use
           the same standard options as they have, and the I/O handling they give.

           The block operator "Mpp::Cmds::frame" is followed by a single letter option list of
           the builtins (maximally "qw(f i I o O r s)").  Even if you specify your own option
           overriding one of these, you still give the single letter of the standard option.

           Each option of your own is specified as "[qw(n name), \$ref, arg, sub]".  The first two
           elements are the short and long name, followed by the variable reference and optionally
           by a boolean for whether to take an argument.  Without an arg, the variable is
           incremented each time the option is given, else the option value is stored in it.

               sub c_my_ocmd {             # Typical output case
                 local @ARGV = @_;
                 Mpp::Cmds::frame {

                   ... print something here with @ARGV, with options already automatically removed

                 } 'f', qw(o O);
               }

               sub c_my_icmd {             # Typical input case with 2 options
                 local @ARGV = @_;
                 my( $short, $long );
                 Mpp::Cmds::frame {

                   ... do something here with <>

                 } qw(i I r s),            # s specifies only --separator, not -s
                   [qw(s short), \$short], # No option arg -> $short == 1
                   [qw(l long), \$long, 1, sub { warn "got arg $long"}];
               }

           Here comes a simple command which upcases only the first character of each input
           record (equivalent to "&sed '$$_ = "\u\L$$_"'"):

               sub c_uc {
                 local @ARGV = @_;
                 Mpp::Cmds::frame {
                   print "\u\L$_" while <>;
                 } 'f', qw(i I o O r s);
               }
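
           The string "\u\L$_" in that command lowercases the whole record and then upcases
           its first character.  The stand-alone sketch below applies the same transformation
           to a list instead of to input records; "upcase_first" is a hypothetical helper
           that does not use makepp's command framework.

```perl
use strict;
use warnings;

# Hypothetical helper applying the same transformation to each list element.
sub upcase_first {
  map "\u\L$_", @_;             # Same as ucfirst lc $_.
}

print upcase_first( "hELLO wORLD\n", "makepp\n" );
```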

           Within the block handled by frame, you can have nested blocks for performing critical
           operations, like opening other files.

               Mpp::Cmds::perform { ... } 'message';

           This will output message with "--verbose" (which every command accepts) if and only if
           the command is successfully run.  But if the block evaluates as false, it dies with the
           negated message.

       run script arguments...
           This is a normal Perl function you can use in any Perl context within your makefile.
           It is similar to the multi-argument form of system, but it runs the Perl script within
           the current process.  For makepp statements, the perl function or your own functions,
           that is the process running makepp.  But for a rule, that is the subprocess performing
           it.  The script gets parsed as many times as it gets called, but you can put the real
           work into a lib, as pod2html does.  This lib can then get used in the top level, so
           that it's already present:

               perl { use mylib }          # gets forked to all rules which needn't reparse it

               %.out: %.in
                   makeperl { run qw'myscript -o $(output) $(input)' }

           If the script calls "exit", closes standard file descriptors or relies on the system
           to clean up after it (open files, memory...), this can be a problem with "run".  If
           you call "run" within statements or the perl function, makepp can get disturbed or the
           cleanup only happens at the end of makepp.

           If you have one of the aforementioned problems, run the script externally instead, i.e.
           as you would from the command line.  Within a rule cleanup is less of a problem,
           especially not as the last action of a rule, since the rule subprocess will exit
           afterwards anyway, except on Windows.

   Writing your own signature methods
       Sometimes you want makepp to compute a signature using a different technique.  For
       example, suppose you have a binary that depends on a shared library.  Ordinarily, if you
       change the shared library, you don't have to relink executables that depend on it because
       the linking is done at run time.  (However, it is possible that relinking the executable
       might be necessary, which is why I did not make this the default.)  What you want makepp
       to do is to have the same signature for the shared library even if it changes.

       This can be accomplished in several ways.  The easiest way is to create your own new
       signature method (let's call it "shared_object").  You would use this signature method
       only on rules that link binaries, like this:

           myprogram : *.o lib1/lib1.so lib2/lib2.so
               : signature shared_object
               $(CC) $(inputs) -o $(output)

       Now we have to create the signature method.

       All signature methods must be their own class, and the class must contain a few special
       items (see Mpp/Signature.pm in the distribution for details).  The class's name must be
       prefixed with "Mpp::Signature::", so in this case our class should be called
       "Mpp::Signature::shared_object".  We have to create a file called shared_object.pm and put
       it into a Mpp/Signature directory somewhere in the Perl include path; the easiest place
       might be in the Mpp/Signature directory in the makepp installation (e.g.,
       /usr/local/share/makepp/Mpp/Signature or wherever you installed it).

       For precise details about what has to go in this class, you should look carefully through
       the file Mpp/Signature.pm and probably also Mpp/Signature/exact_match.pm in the makepp
       distribution.  But in our case, all we want to do is to make a very small change to an
       existing signature mechanism; if the file is a shared library, we want to have a constant
       signature, whereas if the file is anything else, we want to rely on makepp's normal
       signature mechanism.  The best way to do this is to inherit from
       "Mpp::Signature::c_compilation_md5", which is the signature method that is usually chosen
       when makepp recognizes a link command.

       So the file Mpp/Signature/shared_object.pm might contain the following:

           use strict;
           package Mpp::Signature::shared_object;
           use Mpp::Signature::c_compilation_md5;
           our @ISA = qw(Mpp::Signature::c_compilation_md5); # Indicate inheritance.
           our $shared_object = bless \@ISA; # A piece of magic that helps makepp find
                                       # the subroutines for this method.  All
                                       # signature methods must have one of these.
                                       # The value is not used, just any object.
           # Now here's the method that gets called when we need the signature of
           # any target or dependency for which this signature method is active:
           sub signature {
             my ($self,                 # This will be the same as $shared_object.
                 $finfo) = @_;          # A special structure that contains everything
                                        # makepp knows about this file.  See
                                        # Mpp/File.pm for details.

             if ($finfo->{NAME} =~ /\.s[oa]$/) { # Does the file name end in .so or .sa?
               return $finfo->file_exists ? 'exists' : '';
                                        # Always return the same signature if the file
                                        # exists.  In this case, the signature is the
                                        # string "exists".
             }

             &Mpp::Signature::c_compilation_md5::signature;
                                        # If the file didn't end in .so or .sa,
                                        # delegate to makepp's usual signature method.
                                        # The leading "&" passes the current @_ on.
           }

       This file is provided as an example in the makepp distribution, with some additional
       comments.
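
       The override-and-delegate pattern used in that class can be tried without makepp.  In
       the following sketch "Stub::Base" and "Stub::shared_object" are hypothetical stand-ins
       for the real classes, and a plain hash with "NAME" and "EXISTS" keys stands in for the
       file info structure.

```perl
use strict;
use warnings;

package Stub::Base;             # Hypothetical stand-in for the inherited method.
sub signature {
  my( $self, $finfo ) = @_;
  "base:$finfo->{NAME}";
}

package Stub::shared_object;    # Hypothetical stand-in for the derived method.
our @ISA = 'Stub::Base';
sub signature {
  my( $self, $finfo ) = @_;
  return $finfo->{EXISTS} ? 'exists' : '' if $finfo->{NAME} =~ /\.s[oa]$/;
  &Stub::Base::signature;       # "&" without parentheses passes the current @_ on.
}

package main;
my $method = bless [], 'Stub::shared_object';
print $method->signature( { NAME => 'lib1.so', EXISTS => 1 } ), "\n";
print $method->signature( { NAME => 'main.o',  EXISTS => 1 } ), "\n";
```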

       Incidentally, why don't we make this the default?  Well, there are times when changing a
       shared library will require a relinking of your program.  If you ever change either the
       symbols that a shared library defines, or the symbols that it depends on other libraries
       for, a relink may sometimes be necessary.

       Suppose, for example, that the shared library invokes some subroutines that your program
       provides.  E.g., suppose you change the shared library so it now calls an external
       subroutine "xyz()".  Unless you use the "-E" or "--export-dynamic" option to the linker
       (for GNU binutils; other linkers have different option names), the symbol "xyz()" may not
       be accessible to the run-time linker even if it exists in your program.

       Even worse, suppose you defined "xyz()" in another library (call it libxyz), like this:

           my_program: main.o lib1/lib1.so xyz/libxyz.a

       Since "libxyz" is a .a file and not a .so file, "xyz()" may not be pulled in correctly
       from libxyz.a unless you relink your binary.

       Mpp::Signature methods control not only the string that is used to determine if a file
       has changed, but also the algorithm that is used to compare the strings.  For example, the
       signature method "target_newer" in the makepp distribution merely requires that the
       targets be newer than the dependencies, whereas the signature method "exact_match" (and
       everything that depends on it, such as "md5" and "c_compilation_md5") requires that the
       file have the same signature as on the last build.
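
       The difference between the two comparison strategies can be sketched as follows.  Both
       functions are hypothetical simplifications working on plain strings and modification
       times; they are not the code of the "exact_match" or "target_newer" methods.

```perl
use strict;
use warnings;

# In the spirit of "exact_match": rebuild when the stored signature differs.
sub rebuild_exact_match {
  my( $stored_sig, $current_sig ) = @_;
  $stored_sig ne $current_sig;
}

# In the spirit of "target_newer": rebuild when any dependency is not older
# than the target.  Returns the number of such dependencies.
sub rebuild_target_newer {
  my( $target_mtime, @dep_mtimes ) = @_;
  scalar grep { $_ >= $target_mtime } @dep_mtimes;
}

printf "%s\n", rebuild_exact_match( 'abc', 'abd' ) ? 'rebuild' : 'up to date';
printf "%s\n", rebuild_target_newer( 100, 50, 99 ) ? 'rebuild' : 'up to date';
```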

       Here are some other kinds of signature methods that might be useful, to help you realize
       the possibilities.  If general purpose enough, some of these may eventually be
       incorporated into makepp:

       •   A signature method for shared libraries that returns a checksum of all the exported
           symbols, and also all the symbols that it needs from other libraries.  This solves the
           problem with the example above, and guarantees a correct link under all circumstances.
           An experimental attempt has been made to do this in the makepp distribution (see
           Mpp/Signature/shared_object.pm), but it will only work with GNU binutils and ELF
           libraries at the moment.

       •   A signature method that ignores a date stamp written into a file.  E.g., if you
           generate a .c file automatically using some program that insists on putting a string
           in like this:

               static char * date_stamp = "Generated automatically on 01 Apr 2004 by nobody";

           you could write a signature method that specifically ignores changes in date stamps.
           Thus if the date stamp is the only thing that has changed, makepp will not rebuild.

       •   A signature method that computes the signatures the normal way, but ignores the
           architecture dependence when deciding whether to rebuild.  This could be useful for
           truly architecture-independent files; currently if you build on one architecture,
           makepp will insist on rebuilding even architecture-independent files when you switch
           to a different architecture.

       •   A signature method that knows how to ignore comments in latex files, as the
           "c_compilation_md5" method knows how to ignore comments in C files.

       •   A signature method for automatic documentation extraction that checksums only the
           comments that a documentation extractor needs and ignores other changes to the source
           file.
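
       As an illustration of the date stamp idea from this list, the core of such a method
       could be a checksum over the content with the stamp lines left out.  This is only a
       sketch under assumptions: "datestamp_insensitive_signature" is a hypothetical helper,
       it takes the file content as a string rather than a makepp file info object, and it
       assumes the stamp always contains the phrase "Generated automatically on".

```perl
use strict;
use warnings;
use Digest::MD5 'md5_hex';

# Hypothetical helper: checksum the content with date stamp lines left out.
sub datestamp_insensitive_signature {
  my( $content ) = @_;
  $content =~ s/^.*\bGenerated automatically on\b.*\n?//mg;
  md5_hex( $content );
}

my $old = qq{static char * date_stamp = "Generated automatically on 01 Apr 2004 by nobody";\nint x = 1;\n};
my $new = qq{static char * date_stamp = "Generated automatically on 02 Apr 2004 by nobody";\nint x = 1;\n};
print datestamp_insensitive_signature( $old ) eq datestamp_insensitive_signature( $new )
  ? "same signature\n" : "different signature\n";
```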

   Unfinished
       This document is not finished yet.  It should cover how to write your own scanners for
       include files and things like that.