Ubuntu Manpage: funcalc - Funtools calculator (for binary tables)

NAME

       funcalc - Funtools calculator (for binary tables)

SYNOPSIS

       funcalc [-n] [-u] [-a argstr] [-e expr] [-f file] [-l libs] [-p prog] <iname> [oname [columns]]

OPTIONS

         -a argstr    # user arguments to pass to the compiled program
         -e expr      # funcalc expression
         -f file      # file containing funcalc expression
         -l libs      # libs to add to link command
         -n           # output generated code instead of compiling and executing
         -p prog      # generate named program, no execution
         -u           # die if any variable is undeclared (don't auto-declare)

DESCRIPTION

       funcalc  is  a  calculator  program  that  allows  arbitrary expressions to be constructed, compiled, and
       executed on columns in a Funtools table (FITS binary table or raw event file). It  works  by  integrating
       user-supplied  expression(s) into a template C program, then compiling and executing the program. funcalc
       expressions are C statements, although some important simplifications (such as automatic  declaration  of
       variables) are supported.

       funcalc expressions can be specified in three ways: on the command line using the -e [expression] switch,
       in a file using the -f [file] switch, or from stdin (if neither -e nor -f is specified). Of course a file
       containing funcalc expressions can be read from stdin.

       Each  invocation  of  funcalc  requires an input Funtools table file to be specified as the first command
       line argument.  The output Funtools table file is the second optional argument. It is needed only  if  an
       output  FITS  file  is  being created (i.e., in cases where the funcalc expression only prints values, no
       output file is needed). If input and output file are  both  specified,  a  third  optional  argument  can
       specify  the  list  of  columns  to  activate  (using FunColumnActivate()).  Note that funcalc determines
       whether or not to generate code for writing an output file based on the presence or absence of an  output
       file argument.

       A  funcalc  expression  executes  on  each  row  of a table and consists of one or more C statements that
       operate on the columns of that row (possibly using temporary variables).  Within an expression, reference
       is made to a column of the current row using the C struct syntax cur->[colname], e.g.  cur->x,  cur->pha,
       etc.  Local scalar variables can be defined using C declarations at the very beginning of the expression,
       or else they can be defined automatically by funcalc (to be of type double). Thus, for example, a swap of
       columns x and y in a table can be performed using either of the following equivalent funcalc expressions:

         double temp;
         temp = cur->x;
         cur->x = cur->y;
         cur->y = temp;

       or:

         temp = cur->x;
         cur->x = cur->y;
         cur->y = temp;

       When this expression is executed using a command such as:

         funcalc -f swap.expr itest.ev otest.ev

       the resulting file will have values of the x and y columns swapped.

       By  default,  the  data  type  of the variable for a column is the same as the data type of the column as
       stored in the file. This can be changed by appending ":[dtype]" to the first reference to that column. In
       the example above, to force x and y to be output as doubles, specify the type 'D' explicitly:

         temp = cur->x:D;
         cur->x = cur->y:D;
         cur->y = temp;

       Data type specifiers follow standard FITS table syntax for defining columns using TFORM:

       •   A: ASCII characters

       •   B: unsigned 8-bit char

       •   I: signed 16-bit int

       •   U: unsigned 16-bit int (not standard FITS)

       •   J: signed 32-bit int

       •   V: unsigned 32-bit int (not standard FITS)

       •   E: 32-bit float

       •   D: 64-bit float

       •   X: bits (treated as an array of chars)

       Note that only the first reference to a column should contain the explicit data type specifier.

       Of course, it is important to handle the data type of the columns correctly.  One of the most frequent
       causes of error in funcalc programming is the implicit use of the wrong data type for a column in an
       expression.  For example, the calculation:

         dx = (cur->x - cur->y)/(cur->x + cur->y);

       usually needs to be performed using floating point arithmetic. In cases where the x  and  y  columns  are
       integers, this can be done by reading the columns as doubles using an explicit type specification:

         dx = (cur->x:D - cur->y:D)/(cur->x + cur->y);

       Alternatively, it can be done using C type-casting in the expression:

         dx = ((double)cur->x - (double)cur->y)/((double)cur->x + (double)cur->y);

       In addition to accessing columns in the current row, reference also can be made to the previous row using
       prev->[colname], and to the next row using next->[colname].  Note that if prev->[colname] is specified in
       the funcalc expression, the very first row is not processed.  If next->[colname] is specified in the
       funcalc expression, the very last row is not processed. In this way, prev and next are guaranteed always
       to point to valid rows.  For example, to print out the values of the current x column and the previous y
       column, use the C fprintf function in a funcalc expression:

         fprintf(stdout, "%d %d\n", cur->x, prev->y);

       New columns can be specified using the same cur->[colname] syntax by appending the column type (and
       optional tlmin/tlmax/binsiz specifiers), separated by colons. For example, cur->avg:D will define a new
       column of type double. Type specifiers are the same as those used above to specify new data types for
       existing columns.

       For  example,  to  create and output a new column that is the average value of the x and y columns, a new
       "avg" column can be defined:

         cur->avg:D = (cur->x + cur->y)/2.0

       Note that the final ';' is not required for single-line expressions.

       As with FITS TFORM data type specification, the column data type specifier can be preceded by  a  numeric
       count  to  define  an array, e.g., "10I" means a vector of 10 short ints, "2E" means two single precision
       floats, etc.  A new column only needs to be defined once in a funcalc expression, after which it  can  be
       used without re-specifying the type. This includes reference to elements of a column array:

         cur->avg[0]:2D = (cur->x + cur->y)/2.0;
         cur->avg[1] = (cur->x - cur->y)/2.0;

       The  'X'  (bits)  data  type  is  treated  as  a  char array of dimension (numeric_count/8), i.e., 16X is
       processed as a 2-byte char array. Each 8-bit array element is accessed separately:

         cur->stat[0]:16X  = 1;
         cur->stat[1]      = 2;

       Here, a 16-bit column is created with the MSB set to 1 and the LSB set to 2.

       By default, all processed rows are written to the specified output file. If  you  want  to  skip  writing
       certain  rows,  simply execute the C "continue" statement at the end of the funcalc expression, since the
       writing of the row is performed immediately after the  expression  is  executed.  For  example,  to  skip
       writing rows whose average is the same as the current x value:

         cur->avg[0]:2D = (cur->x + cur->y)/2.0;
         cur->avg[1] = (cur->x - cur->y)/2.0;
         if( cur->avg[0] == cur->x )
           continue;

       If no output file argument is specified on the funcalc command line, no output file is opened and no rows
       are  written.  This is useful in expressions that simply print output results instead of generating a new
       file:

         fpv = (cur->av3:D-cur->av1:D)/(cur->av1+cur->av2:D+cur->av3);
         fbv =  cur->av2/(cur->av1+cur->av2+cur->av3);
         fpu = ((double)cur->au3-cur->au1)/((double)cur->au1+cur->au2+cur->au3);
         fbu =  cur->au2/(double)(cur->au1+cur->au2+cur->au3);
         fprintf(stdout, "%f\t%f\t%f\t%f\n", fpv, fbv, fpu, fbu);

       In the above example, we use both explicit type specification (for "av" columns) and  type  casting  (for
       "au" columns) to ensure that all operations are performed in double precision.

       When an output file is specified, the selected input table is processed and output rows are copied to the
       output file.  Note that the output file can be specified as "stdout" in order to write the output rows to
       the  standard  output.   If  the  output  file argument is passed, an optional third argument also can be
       passed to specify which columns to process.

       In a FITS binary table, it sometimes is desirable to copy all of the other FITS extensions to the  output
       file  as  well.  This  can be done by appending a '+' sign to the name of the extension in the input file
       name. See funtable for a related example.

       funcalc works by integrating the user-specified expression into a template C  program  called  tabcalc.c.
       The  completed  program  then  is  compiled  and  executed.  Variable declarations that begin the funcalc
       expression are placed in the local declaration section of the template main program.  All other lines are
       placed in the template main program's inner processing loop. Other  details  of  program  generation  are
       handled  automatically.  For  example,  column specifiers are analyzed to build a C struct for processing
       rows, which is passed to FunColumnSelect() and used in FunTableRowGet().  If an unknown variable is  used
       in  the  expression,  resulting  in  a compilation error, the program build is retried after defining the
       unknown variable to be of type double.

       Normally, funcalc expression code is added to the funcalc row processing loop. It is possible to add code to
       other parts of the program by placing this code inside special directives of the form:

         [directive name]
           ... code goes here ...
         end

       The directives are:

       •   global: add code and declarations in global space, before the main routine

       •   local: add declarations (and code) just after the local declarations in main

       •   before: add code just before entering the main row processing loop

       •   after: add code just after exiting the main row processing loop

       Thus,  the  following  funcalc  expression  will  declare global variables and make subroutine calls just
       before and just after the main processing loop:

         global
           double v1, v2;
           double init(void);
           double finish(double v);
         end
         before
           v1  = init();
         end
         ... process rows, with calculations using v1 ...
         after
           v2 = finish(v1);
           if( v2 < 0.0 ){
             fprintf(stderr, "processing failed %g -> %g\n", v1, v2);
             exit(1);
           }
         end

       Routines such as init() and finish() above are passed to the generated program for linking using  the  -l
       [link directives ...]  switch. The string specified by this switch will be added to the link line used to
       build  the  program  (before the funtools library). For example, assuming that init() and finish() are in
       the library libmysubs.a in the /opt/special/lib directory, use:

         funcalc  -l "-L/opt/special/lib -lmysubs" ...

       User arguments can be passed to a compiled funcalc program using a string argument to  the  "-a"  switch.
       The string should contain all of the user arguments. For example, to pass the integers 1 and 2, use:

         funcalc -a "1 2" ...

       The  arguments  are  stored  in an internal array and are accessed as strings via the ARGV(n) macro.  For
       example, consider the following expression:

         local
           int pmin, pmax;
         end

         before
           pmin=atoi(ARGV(0));
           pmax=atoi(ARGV(1));
         end

         if( (cur->pha >= pmin) && (cur->pha <= pmax) )
           fprintf(stderr, "%d %d %d\n", cur->x, cur->y, cur->pha);

       This expression will print out x, y, and pha values for all rows in which the pha value  is  between  the
       two user-input values:

         funcalc -a '1 12' -f foo snr.ev'[cir 512 512 .1]'
         512 512 6
         512 512 8
         512 512 5
         512 512 5
         512 512 8

         funcalc -a '5 6' -f foo snr.ev'[cir 512 512 .1]'
         512 512 6
         512 512 5
         512 512 5

       Note that it is the user's responsibility to ensure that the correct number of arguments is passed. The
       ARGV(n) macro returns a NULL if a requested argument is outside the limits of the actual number of  args,
       usually resulting in a SEGV if processed blindly.  To check the argument count, use the ARGC macro:

         local
           long int seed=1;
           double limit=0.8;
         end

         before
           if( ARGC >= 1 ) seed = atol(ARGV(0));
           if( ARGC >= 2 ) limit = atof(ARGV(1));
           srand48(seed);
         end

         if ( drand48() > limit ) continue;

       The  macro  WRITE_ROW expands to the FunTableRowPut() call that writes the current row. It can be used to
       write the row more than once.  In addition, the macro NROW expands to  the  row  number  currently  being
       processed. Use of these two macros is shown in the following example:

         if( cur->pha:I == cur->pi:I ) continue;
         a = cur->pha;
         cur->pha = cur->pi;
         cur->pi = a;
         cur->AVG:E  = (cur->pha+cur->pi)/2.0;
         cur->NR:I = NROW;
         if( NROW < 10 ) WRITE_ROW;

       If the -p [prog] switch is specified, the expression is not executed. Rather, the generated executable is
       saved with the specified program name for later use.

       If  the  -n switch is specified, the expression is not executed. Rather, the generated code is written to
       stdout. This is especially useful if you want to generate a skeleton file and add your own  code,  or  if
       you need to check compilation errors. Note that the comment at the start of the output gives the compiler
       command  needed  to build the program on that platform. (The command can change from platform to platform
       because of the use of different libraries, compiler switches, etc.)

       As mentioned previously, funcalc will declare a scalar variable  automatically  (as  a  double)  if  that
       variable  has  been  used  but  not  declared.   This  facility  is  implemented using a sed script named
       funcalc.sed, which processes the compiler output to sense an undeclared variable error.  This script  has
       been  seeded  with  the  appropriate  error information for gcc, and for cc on Solaris, DecAlpha, and SGI
       platforms. If you find that automatic declaration of scalars is not working on your platform, check  this
       sed script; it might be necessary to add to or edit some of the error messages it senses.

       In order to keep the lexical analysis of funcalc expressions (reasonably) simple, we chose to accept some
       limitations  on  how accurately C comments, spaces, and new-lines are placed in the generated program. In
       particular, comments associated with local variables declared at the beginning of  an  expression  (i.e.,
       not in a local...end block) will usually end up in the inner loop, not with the local declarations:

         /* this comment will end up in the wrong place (i.e., inner loop) */
         double a; /* also in wrong place */
         /* this will be in the right place (inner loop) */
         if( cur->x:D == cur->y:D ) continue; /* also in right place */
         a = cur->x;
         cur->x = cur->y;
         cur->y = a;
         cur->avg:E  = (cur->x+cur->y)/2.0;

       Similarly,  spaces  and  new-lines  sometimes  are  omitted  or added in a seemingly arbitrary manner. Of
       course, none of these stylistic blemishes affect the correctness of the generated code.

       Because funcalc must analyze the user expression using the data file(s) passed on the command  line,  the
       input file(s) must be opened and read twice: once during program generation and once during execution. As
       a result, it is not possible to use stdin for the input file: funcalc cannot be used as a filter. We will
       consider removing this restriction at a later time.

       Along  with C comments, funcalc expressions can have one-line internal comments that are not passed on to
       the generated C program. These internal comments start with the # character and continue up to the
       new-line:

         double a; # this is not passed to the generated C file
         # nor is this
         a = cur->x;
         cur->x = cur->y;
         cur->y = a;
         /* this comment is passed to the C file */
         cur->avg:E  = (cur->x+cur->y)/2.0;

       As previously mentioned, input columns normally are identified by their being used within the inner event
       loop.  There  are  rare cases where you might want to read a column and process it outside the main loop.
       For example, qsort might use a column in its sort comparison routine that is  not  processed  inside  the
       inner loop (and therefore not implicitly specified as a column to be read).  To ensure that such a column
       is  read by the event loop, use the explicit keyword.  The arguments to this keyword specify columns that
       should be read into the input record structure even though they are not mentioned in the inner loop.  For
       example:

         explicit pi pha

       will  ensure  that  the  pi  and pha columns are read for each row, even if they are not processed in the
       inner event loop. The explicit statement can be placed anywhere.

       Finally, note that funcalc currently works on expressions involving FITS  binary  tables  and  raw  event
       files.  We  will  consider  adding support for image expressions at a later point, if there is demand for
       such support from the community.
