Provided by: makepp_2.0.98.5-2_all

NAME

       makepp_tutorial_compilation -- Unix compilation commands

DESCRIPTION

       Skip this manual page if you have a good grasp of what the compilation commands do.

       One thing that distressingly few people seem to be taught in their programming classes is
       how to go about compiling programs once they've written them.  Novices rely either on a
       single memorized command, or else on the built-in rules in make.  I have been surprised by
       extremely computer literate people who learned to compile without optimization because
       they simply never were told how important it is.  Rudimentary knowledge of how compilation
       commands work may make your programs run twice as fast or more, so it's worth at least
       five minutes.  This page describes just about everything you'll need to know to compile C
       or C++ programs on just about any variant of Unix.

       The examples will be mostly for C, since C++ compilation is identical except that the name
       of the compiler is different.  Suppose you're compiling source code in a file called
       "xyz.c" and you want to build a program called "xyz".  What must happen?

       You may know that you can build your program in one step, using a command like this:

           cc -g xyz.c -o xyz

       This will work, but it conceals a two-step process that you must understand if you are
       writing makefiles.  (Actually, there are more than two steps, but you only have to
       understand two of them.)  For a program of more than one module, the two steps are usually
       explicitly separated.

   Compilation
       The first step is the translation of your C or C++ source code into a binary file called
       an object file.  Object files usually have an extension of ".o". (For some more recent
       projects, ".lo" is also used for a slightly different kind of object file.)

       The command to produce an object file on Unix looks something like this:

           cc -g -c xyz.c -o xyz.o

       "cc" is the C compiler.  Sometimes alternate C compilers are used; a very common one is
       called "gcc".  A common C++ compiler is the GNU compiler, usually called "g++".  Virtually
       all C and C++ compilers on Unix have the same syntax for the rest of the command (at least
       for basic operations), so the only difference would be the first word.

       We'll explain what the "-g" option does later.

       The "-c" option tells the C compiler to produce a ".o" file as output.  (If you don't
       specify "-c", then it goes on to perform the second step, linking, automatically.)

       The "-o xyz.o" option tells the compiler what the name of the object file is.  You can
       omit this, as long as the name of the object file is the same as the name of the source
       file except for the ".o" extension.

       For the most part, the order of the options and the file names does not matter.  One
       important exception is that the output file must immediately follow "-o".
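
       The compilation step above can be tried out with a short shell session.  This is only a
       sketch: the contents of "xyz.c" are made up for illustration, and it assumes that "cc" is
       installed and that you are happy to work in a scratch directory.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)"

# A tiny, hypothetical xyz.c to compile.
cat > xyz.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello from xyz\n");
    return 0;
}
EOF

# Compile only: -c stops before linking, and -o names the object file.
cc -g -c xyz.c -o xyz.o

# The object file exists now, but no program has been produced yet.
ls
```

       After this, the directory contains "xyz.c" and "xyz.o" but nothing you can run.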

   Linking
       The second step of building a program is called linking.  An object file cannot be run
       directly; it's an intermediate form that must be linked to other components in order to
       produce a program.  Other components might include:

       •   Libraries.  A library, roughly speaking, is a collection of object modules that are
           included as necessary.  For example, if your program calls the "printf" function, then
           the definition of the "printf" function must be included from the system C library.
           Some libraries are automatically linked into your program (e.g., the one containing
           "printf") so you never need to worry about them.

       •   Object files derived from other source files in your program.  If you write your
           program so that it actually has several source files, normally you would compile each
           source file to a separate object file and then link them all together.

       The linker is the program responsible for taking a collection of object files and
       libraries and linking them together to produce an executable file.  The executable file is
       the program you actually run.

       The command to link the program looks something like this:

           cc -g xyz.o -o xyz

       It may seem odd, but we usually run the same program ("cc") to perform the linking.  What
       happens under the surface is that the "cc" program immediately passes off control to a
       different program (the linker, sometimes called the loader, or "ld") after adding a number
       of complex pieces of information to the command line.  For example, "cc" tells "ld" where
       the system library is that includes the definition of functions like "printf".  Until you
       start writing shared libraries, you usually do not need to deal directly with "ld".

       If you do not specify "-o xyz", then the output file will be called "a.out", which seems
       to me to be a completely useless and confusing convention.  So always specify "-o" on the
       linking step.

       If your program has more than one object file, you should specify all the object files on
       the link command.

   Why you need to separate the steps
       Why not just use the simple, one-step command, like this:

           cc -g xyz.c -o xyz

       instead of the more complicated two-stage compilation

           cc -g -c xyz.c -o xyz.o
           cc -g xyz.o -o xyz

       if internally the first is converted into the second?  The difference is important only if
       there is more than one module in your program.  Suppose we have an additional module,
       "abc.c".  Now our compilation looks like this:

           # One-stage command.
           cc -g xyz.c abc.c -o xyz

       or

           # Two-stage command.
           cc -g -c xyz.c -o xyz.o
           cc -g -c abc.c -o abc.o
           cc -g xyz.o abc.o -o xyz

       The first method, of course, is converted internally into the second method.  This means
       that both "xyz.c" and "abc.c" are recompiled each time the command is run.  But if you
       only changed "xyz.c", there's no need to recompile "abc.c", so the second line of the two-
       stage commands does not need to be done.  This can make a huge difference in compilation
       time, especially if you have many modules.  For this reason, virtually all makefiles keep
       the two compilation steps separate.

       That's pretty much the basics, but there are a few more little details you really should
       know about.

   Debugging vs. optimization
       Usually programmers compile a program either for debugging or for speed.  Compilation
       for speed is called optimization; compiling with optimization can make your code run up to
       5 times faster or more, depending on your code, your processor, and your compiler.

       With such dramatic gains possible, why would you ever not want to optimize?  The most
       important answer is that optimization makes use of a debugger much more difficult
       (sometimes impossible).  (If you don't know anything about a debugger, it's time to learn.
       The half hour or hour you'll spend learning the basics will be repaid many, many times over
       in the time you'll save later when debugging.  I'd recommend starting with a GUI debugger
       like "kdbg", "ddd", or "gdb" run from within emacs (see the info pages on gdb for
       instructions on how to do this).)  Optimization reorders and combines statements, removes
       unnecessary temporary variables, and generally rearranges your code so that it's very
       tough to follow inside a debugger.  The usual procedure is to write your code, compile it
       without optimization, debug it, and then turn on optimization.

       In order for the debugger to work, the compiler has to cooperate not only by not
       optimizing, but also by putting information about the names of the symbols into the object
       file so the debugger knows what things are called.  This is what the "-g" compilation
       option does.

       If you're done debugging, and you want to optimize your code, simply replace "-g" with
       "-O".  For many compilers, you can specify increasing levels of optimization by appending
       a number after "-O".  You may also be able to specify other options that increase the
       speed under some circumstances (possibly trading off with increased memory usage).  See
       your compiler's man page for details.  For example, here is an optimizing compile command
       that I use frequently with the "gcc" compiler:

           gcc -O6 -malign-double -c xyz.c -o xyz.o

       You may have to experiment with different optimization options for the absolute best
       performance.  You may need different options for different pieces of code.  Generally
       speaking, a simple optimization flag like "-O2" works with many compilers and usually
       produces pretty good results.  (Levels as high as "-O6" are not distinct settings in
       "gcc", which treats any level above 3 as "-O3".)

       Warning: on rare occasions, your program doesn't actually do exactly the same thing when
       it is compiled with optimization.  This may be due to (1) an invalid assumption you made
       in your code that was harmless without optimization, but causes problems because the
       compiler takes the liberty of rearranging things when you optimize; or (2) sadly,
       compilers have bugs too, including bugs in their optimizers.  For a stable compiler like
       "gcc" on a common platform like a Pentium, optimization bugs are seldom a problem (as of
       the year 2000--there were problems a few years ago).

       If you don't specify either "-g" or "-O" in your compilation command, the resulting object
       file is suitable neither for debugging nor for running fast.  For some reason, this is the
       default.  So always specify either "-g" or "-O".

       On some systems, you must supply "-g" on both the compilation and linking steps; on others
       (e.g. Linux), it needs to be supplied only on the compilation step.  On some systems, "-O"
       actually does something different in the linking phase, while on others, it has no effect.
       In any case, it's always harmless to supply "-g" or "-O" for both commands.

   Warnings
       Most compilers are capable of catching a number of common programming errors (e.g.,
       forgetting to return a value from a function that's supposed to return a value).  Usually,
       you'll want to turn on warnings.  How you do this depends on your compiler (see the man
       page), but with the "gcc" compiler, I usually use something like this:

           gcc -g -Wall -c xyz.c -o xyz.o

       (Sometimes I also add "-Wno-uninitialized" after "-Wall" because of a warning that is
       usually wrong that crops up when optimizing.)

       These warnings have saved me many, many hours of debugging.
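
       As an illustration, the sketch below compiles a deliberately buggy function that forgets
       to return a value on one path.  The file "buggy.c" is invented for the example, and the
       sketch assumes that "cc" accepts the gcc-style "-Wall" option (true of gcc and clang).
       The exact wording of the warning differs between compilers, so none is quoted here.

```shell
# Work in a scratch directory.
cd "$(mktemp -d)"

# Hypothetical function with a bug: nothing is returned when x >= 0.
cat > buggy.c <<'EOF'
int sign(int x)
{
    if (x < 0)
        return -1;
}
EOF

# Compile with warnings turned on, saving the compiler's messages to a file.
cc -g -Wall -c buggy.c -o buggy.o 2> warnings.txt

# The object file is still produced, but the compiler has complained.
cat warnings.txt
```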

   Other useful compilation options
       Often, necessary include files are stored in some directory other than the current
       directory or the system include directory (/usr/include).  This frequently happens when
       you are using a library that comes with include files to define the functions or classes.

       Suppose, for example, you are writing an application that uses the Qt libraries.  You've
       installed a local copy of the Qt library in /home/users/joe/qt, which means that the
       include files are stored in the directory /home/users/joe/qt/include.  In your code, you
       want to be able to do things like this:

           #include <qwidget.h>

       instead of

           #include "/home/users/joe/qt/include/qwidget.h"

       You can tell the compiler to look for include files in a different directory by using the
       "-I" compilation option:

           g++ -I/home/users/joe/qt/include -g -c mywidget.cpp -o mywidget.o

       There is usually no space between the "-I" and the directory name.

       When the C++ compiler is looking for the file qwidget.h, it will look in
       /home/users/joe/qt/include before looking in the system include directory.  You can
       specify as many "-I" options as you want.

   Using libraries
       You will often have to tell the linker to link with specific external libraries, if you
       are calling any functions that aren't part of the standard C library.  The "-l" (lowercase
       L) option says to link with a specific library:

           cc -g xyz.o -o xyz -lm

       "-lm" says to link with the system math library, which you will need if you are using
       functions like "sqrt".

       Beware: if you specify more than one "-l" option, the order can make a difference on some
       systems.  If the linker reports undefined symbols when you know you have included the
       library that defines them, you might try moving that library to the end of the command
       line, or even including it a second time at the end of the command line.

       Sometimes the libraries you will need are not stored in the default place for system
       libraries.  "-labc" searches for a file called libabc.a or libabc.so or libabc.sa in the
       system library directories (/usr/lib and usually a few other places too, depending on what
       kind of Unix you're running).  The "-L" option specifies an additional directory to search
       for libraries.  To take the above example again, suppose you've installed the Qt libraries
       in /home/users/joe/qt, which means that the library files are in /home/users/joe/qt/lib.
       Your link step for your program might look something like this:

           g++ -g test_mywidget.o mywidget.o -o test_mywidget -L/home/users/joe/qt/lib -lqt

       (On some systems, if you link in Qt you will need to add other libraries as well (e.g.,
       "-L/usr/X11R6/lib -lX11 -lXext").  What you need to do will depend on your system.)

       Note that there is no space between "-L" and the directory name.  The "-L" option usually
       goes before any "-l" options it's supposed to affect.

       How do you know which libraries you need?  In general, this is a hard question, and varies
       depending on what kind of Unix you are running.  The documentation for the functions or
       classes you are using should say what libraries you need to link with.  If you are using
       functions or classes from an external package, there is usually a library you need to link
       with; the library will usually be a file called "libabc.a" or "libabc.so" or "libabc.sa"
       if you need to add a "-labc" option.

   Some other confusing things
       You may have noticed that it is possible to specify options which normally apply to
       compilation on the linking step, and options which normally apply to linking on the
       compilation step.  For example, the following commands are valid:

           cc -g -L/usr/X11R6/lib -c xyz.c -o xyz.o
           cc -g -I/somewhere/include xyz.o -o xyz

       The irrelevant options are ignored; the above commands are exactly equivalent to this:

           cc -g -c xyz.c -o xyz.o
           cc -g xyz.o -o xyz